Values Disappear when Filtering Correlated Subquery - sql

This question is related to the recent answer I provided here.
Setup
Using MS Access 2007.
Assume I have a table called mytable consisting of three fields:
id Long Integer AutoNumber (PK)
type Text
num Long Integer
With the following sample data:
+----+------+-----+
| id | type | num |
+----+------+-----+
| 1 | A | 10 |
| 2 | A | 20 |
| 3 | A | 30 |
| 4 | B | 40 |
| 5 | B | 50 |
| 6 | B | 60 |
| 7 | C | 70 |
| 8 | C | 80 |
| 9 | C | 90 |
| 10 | D | 100 |
+----+------+-----+
Similar to the linked answer, say I wish to output the three fields, with a running total for each type value, with the value of the running total limited to a maximum of 100, I might use a correlated subquery such as the following:
select q.* from
(
select t.id, t.type, t.num,
(
select sum(u.num)
from mytable u where u.type = t.type and u.id <= t.id
) as rt
from mytable t
) q
where q.rt < 100
This produces the expected result:
+----+------+-----+----+
| id | type | num | rt |
+----+------+-----+----+
| 1 | A | 10 | 10 |
| 2 | A | 20 | 30 |
| 3 | A | 30 | 60 |
| 4 | B | 40 | 40 |
| 5 | B | 50 | 90 |
| 7 | C | 70 | 70 |
+----+------+-----+----+
Observation
Now assume that I wish to filter the result to show only those values for type like "[AB]".
If I use either of the following queries:
select q.* from
(
select t.id, t.type, t.num,
(
select sum(u.num)
from mytable u where u.type = t.type and u.id <= t.id
) as rt
from mytable t
where t.type like "[AB]"
) q
where q.rt < 100
select q.* from
(
select t.id, t.type, t.num,
(
select sum(u.num)
from mytable u where u.type = t.type and u.id <= t.id
) as rt
from mytable t
) q
where q.rt < 100 and q.type like "[AB]"
The results are filtered as expected, but the values in the rt (running total) column disappear:
+----+------+-----+----+
| id | type | num | rt |
+----+------+-----+----+
| 1 | A | 10 | |
| 2 | A | 20 | |
| 3 | A | 30 | |
| 4 | B | 40 | |
| 5 | B | 50 | |
+----+------+-----+----+
Question
Why would the filter cause the values returned by the correlated subquery to disappear?
Thank you for your time reading my question and in advance for any advice you can offer.

Moving type criteria to the aggregate subquery works.
One less tier works but the aggregate subquery has to repeat in WHERE clause:
SELECT mytable.*, (select sum(u.num)
from mytable u where u.type = MyTable.type and u.id <= MyTable.id
) AS rt
FROM mytable
WHERE ((((select sum(u.num)
from mytable u where u.type = MyTable.type and u.id <= MyTable.id
))<100) AND ((mytable.[type]) Like "[AB]"));
An INNER JOIN version:
select MyTable.*, q.* from MyTable INNER JOIN
(
select t.id, t.type, t.num,
(
select sum(u.num)
from mytable u where u.type = t.type and u.id <= t.id
) as rt
from mytable t
) q
ON q.id=MyTable.ID
where q.rt < 100 AND MyTable.Type LIKE "[AB]";

Related

Postgresql left join

I have two tables cars and usage. I create a record in usage once a month for some of cars.
Now I want to get distinct list of cars with their latest usage that I saved.
first of all look at the tables please
cars:
| id | model | reseller_id |
|----|-------------|-------------|
| 1 | Samand Sall | 324228 |
| 2 | Saba 141 | 92933 |
usages:
| id | car_id | year | month | gas |
|----|--------|------|-------|-----|
| 1 | 2 | 2020 | 2 | 68 |
| 2 | 2 | 2020 | 3 | 94 |
| 3 | 2 | 2020 | 4 | 33 |
| 4 | 2 | 2020 | 5 | 12 |
The problem is here
I need only the latest usage of year and month
I tried a lot of ways but none of them is good enough. because sometimes this query gets me one ofnot latest records of usages.
SELECT * FROM cars AS c
LEFT JOIN
(select *
from usages
) u on (c.id = u.car_id)
order by u.gas desc
You can do this with a DISTINCT ON in the derived table:
SELECT *
FROM cars AS c
LEFT JOIN (
select distinct on (u.car_id) *
from usages u
order by u.car_id, u.year desc, u.month desc
) lu on c.id = lu.car_id
order by u.gas desc;
I think you need window function row_number. Here is the demo.
select
id,
model,
reseller_id
from
(
select
c.id,
model,
reseller_id,
row_number() over (partition by u.car_id order by u.id desc) as rn
from cars c
left join usages u
on c.id = u.car_id
) subq
where rn = 1

Impala - Does impala allow multi GROUP_CONCAT in one query

For example, I have a table below
+-----------+-------+------------+
| Id | a| b|
+-----------+-------+------------+
| 1 | 6 | 20 |
| 1 | 4 | 55 |
| 1 | 9 | 56 |
| 1 | 2 | 67 |
| 1 | 7 | 80 |
| 1 | 5 | 66 |
| 1 | 3 | 33 |
| 1 | 8 | 34 |
| 1 | 1 | 52 |
I want the output would be like below by using Impala
+-----------+-------------------+-----------------------------+
| Id | a | b |
+-----------+-------------------+-----------------------------+
| 1 | 6,4,9,2,7,5,3,8,1 | 20,55,56,67,80,66,33,34,52 |
+-----------+-------------------+-----------------------------+
In Impala, I have used
SELECT Id,
group_concat(DISTINCT a) AS a,
group_concat(DISTINCT b) AS b
FROM table GROUP BY Id
It will always get Syntax error. Just wondering is that we are not allowed to use multi group_concat for one query in Impala? or not allow to use multi Distinct for one query?
From the documentation for GROUP_CONCAT:
You cannot apply the DISTINCT operator to the argument of this function.
But, as workaround, we can use two separate subqueries to find the distinct values:
WITH cte1 AS (
SELECT Id, GROUP_CONCAT(a) AS a
FROM (SELECT DISTINCT Id, a FROM yourTable) t
GROUP BY Id
),
cte2 AS (
SELECT Id, GROUP_CONCAT(b) AS b
FROM (SELECT DISTINCT Id, b FROM yourTable) t
GROUP BY Id
)
SELECT
t1.Id,
t1.a,
t2.b
FROM cte1 t1
INNER JOIN cte2 t2
ON t1.Id = t2.Id;

Finding nth row using sql

select top 20 *
from dbo.DUTs D
inner join dbo.Statuses S on d.StatusID = s.StatusID
where s.Description = 'Active'
Above SQL Query returns the top 20 rows, how can I get a nth row from the result of the above query? I looked at previous posts on finding the nth row and was not clear to use it for my purpose.
Thanks.
The row order is arbitrary, so I would add an ORDER BY expression. Then, you can do something like this:
SELECT TOP 1 * FROM (SELECT TOP 20 * FROM ... ORDER BY d.StatusID) AS d ORDER BY d.StatusID DESC
to get the 20th row.
You can also use OFFSET like:
SELECT * FROM ... ORDER BY d.StatusID OFFSET 19 ROWS FETCH NEXT 1 ROWS ONLY
And a third option:
SELECT * FROM (SELECT *, rownum = ROW_NUMBER() OVER (ORDER BY d.StatusID) FROM ...) AS a WHERE rownum = 20
I tend to use CTEs with the ROW_NUMBER() function to get my lists numbered in order. As #zambonee said, you'll need an ORDER BY clause either way or SQL can put them in a different order every time. It doesn't usually, but without ordering it yourself, you're not guaranteed to get the same thing twice. Here I'm assuming there's a [DateCreated] field (DATETIME NOT NULL DEFAULT GETDATE()), which is usually a good idea so you know when that record was entered. This says "give me everything in that table and add a row number with the most recent record as #1":
; WITH AllDUTs
AS (
SELECT *
, DateCreatedRank = ROW_NUMBER() OVER(ORDER BY [DateCreated] DESC)
FROM dbo.DUTs D
INNER JOIN dbo.Statuses S ON D.StatusID = S.StatusID
WHERE S.Description = 'Active'
)
SELECT *
FROM AllDUTs
WHERE AllDUTs.DateCreatedRank = 20;
SELECT * FROM (SELECT * FROM EMP ORDER BY ROWID DESC) WHERE ROWNUM<11
It's another sample:
SELECT * ,CASE WHEN COUNT(0)OVER() =ROW_NUMBER()OVER(ORDER BY number) THEN 1 ELSE 0 END IsNth
FROM (
select top 10 *
from master.dbo.spt_values AS d
where d.type='P'
) AS t
+------+--------+------+-----+------+--------+-------+
| name | number | type | low | high | status | IsNth |
+------+--------+------+-----+------+--------+-------+
| NULL | 0 | P | 1 | 1 | 0 | 0 |
| NULL | 1 | P | 1 | 2 | 0 | 0 |
| NULL | 2 | P | 1 | 4 | 0 | 0 |
| NULL | 3 | P | 1 | 8 | 0 | 0 |
| NULL | 4 | P | 1 | 16 | 0 | 0 |
| NULL | 5 | P | 1 | 32 | 0 | 0 |
| NULL | 6 | P | 1 | 64 | 0 | 0 |
| NULL | 7 | P | 1 | 128 | 0 | 0 |
| NULL | 8 | P | 2 | 1 | 0 | 0 |
| NULL | 9 | P | 2 | 2 | 0 | 1 |
+------+--------+------+-----+------+--------+-------+

I need a specific output

I have to get a specific output format from my tables.
Let's say I have a simple table with 2 columns name and value.
table T1
+---------------+------------------+
| Name | Value |
+---------------+------------------+
| stuff1 | 1 |
| stuff1 | 1 |
| stuff2 | 2 |
| stuff3 | 1 |
| stuff2 | 4 |
| stuff2 | 2 |
| stuff3 | 4 |
+---------------+------------------+
I know the values are in the interval 1-4. I group it by name and value and count number of the same rows as Number and get the following table:
table T2
+---------------+------------------+--------+
| Name | Value | Number |
+---------------+------------------+--------+
| stuff1 | 1 | 2 |
| stuff2 | 2 | 2 |
| stuff3 | 1 | 1 |
| stuff3 | 4 | 1 |
+---------------+------------------+--------+
Here is the part when I need your help! What should I do if I want to get these format?
table T3
+---------------+------------------+--------+
| Name | Value | Number |
+---------------+------------------+--------+
| stuff1 | 1 | 2 |
| stuff1 | 2 | 0 |
| stuff1 | 3 | 0 |
| stuff1 | 4 | 0 |
| stuff2 | 1 | 0 |
| stuff2 | 2 | 2 |
| stuff2 | 3 | 0 |
| stuff2 | 4 | 0 |
| stuff3 | 1 | 1 |
| stuff3 | 2 | 0 |
| stuff3 | 3 | 0 |
| stuff3 | 4 | 1 |
+---------------+------------------+--------+
Thanks for any suggestions!
You start with a cross join to generate all possible combinations and then left-join in the results from your existing query:
select n.name, v.value, coalesce(nv.cnt, 0) as "Number"
from (select distinct name from table t) n cross join
(select distinct value from table t) v left outer join
(select name, value, count(*) as cnt
from table t
group by name, value
) nv
on nv.name = n.name and nv.value = v.value;
Variation on the theme.
Differences between Gordon Linoff and Owen existing answers.
I prefer GROUP BY to get the Names rather than a DISTINCT. This may have better performance in a case like this. (See Rob Farley's still relevant article.)
I explode the subqueries into a series of CTEs for clarity.
I use table T2 as the question now labels the group results set instead of showing that as as subquery.
WITH PossibleValue AS (
SELECT 1 Value
UNION ALL
SELECT Value + 1
FROM PossibleValue
WHERE Value < 4
),
Name AS (
SELECT Name
FROM T1
GROUP BY Name
),
NameValue AS (
SELECT Name
,Value
FROM Name
CROSS JOIN
PossibleValue
)
SELECT nv.Name
,nv.Value
,ISNULL(T2.Number,0) Number
FROM NameValue nv
LEFT JOIN
T2 ON nv.Name = T2.Name
AND nv.Value = T2.Value
Yet another solution, this time using a Table Value Constructor in a CTE to build a table of name value combinations.
WITH value AS
( SELECT DISTINCT t.name, v.value
FROM T1 AS t
CROSS JOIN (VALUES (1),(2),(3),(4)) AS v (value)
)
SELECT v.name AS 'Name', v.value AS 'Value', COUNT(t.name) AS 'Number'
FROM value AS v
LEFT JOIN T1 AS t ON t.value = v.value AND t.name = v.name
GROUP BY v.name, v.value, t.name;

order by after full outer join

I create the following table on http://sqlfiddle.com in PostgreSQL 9.3.1 mode:
CREATE TABLE t
(
id serial primary key,
m varchar(1),
d varchar(1),
c int
);
INSERT INTO t
(m, d, c)
VALUES
('A', '1', 101),
('A', '2', 102),
('A', '3', 103),
('B', '1', 104),
('B', '3', 105);
table:
| ID | M | D | C |
|----|---|---|-----|
| 1 | A | 1 | 101 |
| 2 | A | 2 | 102 |
| 3 | A | 3 | 103 |
| 4 | B | 1 | 104 |
| 5 | B | 3 | 105 |
From this I want to generate such a table:
| M | D | ID | C |
|---|---|--------|--------|
| A | 1 | 1 | 101 |
| A | 2 | 2 | 102 |
| A | 3 | 3 | 103 |
| B | 1 | 4 | 104 |
| B | 2 | (null) | (null) |
| B | 3 | 5 | 105 |
but with my current statement
select * from
(select * from
(select distinct m from t) as dummy1,
(select distinct d from t) as dummy2) as combi
full outer join
t
on combi.d = t.d and combi.m = t.m
I only get the following
| M | D | ID | C |
|---|---|--------|--------|
| A | 1 | 1 | 101 |
| B | 1 | 4 | 104 |
| A | 2 | 2 | 102 |
| A | 3 | 3 | 103 |
| B | 3 | 5 | 105 |
| B | 2 | (null) | (null) |
Attempts to order it by m,d fail so far:
select * from
(select * from
(select * from
(select * from
(select distinct m from t) as dummy1,
(select distinct d from t) as dummy2) as kombi
full outer join
t
on kombi.d = t.d and kombi.m = t.m) as result)
order by result.m
Error message:
ERROR: subquery in FROM must have an alias: select * from (select * from (select * from (select * from (select distinct m from t) as dummy1, (select distinct d from t) as dummy2) as kombi full outer join t on kombi.d = t.d and kombi.m = t.m) as result) order by result.m
It would be cool if somebody could point out to me what I am doing wrong and perhaps show the correct statement.
select * from
(select kombi.m, kombi.d, t.id, t.c from
(select * from
(select distinct m from t) as dummy1,
(select distinct d from t) as dummy2) as kombi
full outer join t
on kombi.d = t.d and kombi.m = t.m) as result
order by result.m, result.d
I think your problem is the order. You can solve this problem with the order by clause:
select * from
(select * from
(select distinct m from t) as dummy1,
(select distinct d from t) as dummy2) as combi
full outer join
t
on combi.d = t.d and combi.m = t.m
order by combi.m, combi.d
You need to specify which data you would like to order. In this case you get back the row from the combi table, so you need to say that.
http://sqlfiddle.com/#!15/ddc0e/17
You could also use column numbers instead of names to do the ordering.
select * from
(select * from
(select distinct m from t) as dummy1,
(select distinct d from t) as dummy2) as combi
full outer join
t
on combi.d = t.d and combi.m = t.m
order by 1,2;
| M | D | ID | C |
|---|---|--------|--------|
| A | 1 | 1 | 101 |
| A | 2 | 2 | 102 |
| A | 3 | 3 | 103 |
| B | 1 | 4 | 104 |
| B | 2 | (null) | (null) |
| B | 3 | 5 | 105 |
you just need a pivot table
the query is very simple
select classes.M, p.i as D, t.ID, t.C
from (select M, max(D) MaxValue from t group by m) classes
inner join pivot p
on p.i =< classes.MaxValue
left join t
on t.M = classes.M
and t.D = p.i
pivot table is a dummy table some how
CREATE TABLE Pivot (
i INT,
PRIMARY KEY(i)
)
populate is some how
CREATE TABLE Foo(
i CHAR(1)
)
INSERT INTO Foo VALUES('0')
INSERT INTO Foo VALUES('1')
INSERT INTO Foo VALUES('2')
INSERT INTO Foo VALUES('3')
INSERT INTO Foo VALUES('4')
INSERT INTO Foo VALUES('5')
INSERT INTO Foo VALUES('6')
INSERT INTO Foo VALUES('7')
INSERT INTO Foo VALUES('8')
INSERT INTO Foo VALUES('9')
Using the 10 rows in the Foo table, you can easily populate the Pivot table with 1,000 rows. To get 1,000 rows from 10 rows, join Foo to itself three times to create a Cartesian product:
INSERT INTO Pivot
SELECT f1.i+f2.i+f3.i
FROM Foo f1, Foo F2, Foo f3
you can read about that in Transac-SQL Cookbook by Jonathan Gennick, Ales Spetic
You just need to order by the final column definitions. t.m and t.d. SO your final SQL would be...
SELECT *
FROM (SELECT *
FROM (SELECT DISTINCT m FROM t) AS dummy1,
(SELECT DISTINCT d FROM t) AS dummy2) AS combi
FULL OUTER JOIN t
ON combi.d = t.d
AND combi.m = t.m
ORDER BY t.m,
t.d;
Also for query optimization perspective, it is better to now have many layers of sub queries.
I think you need another correlation name - dummy3? - after 'as result )' before the order by.