I have a table like below
id name dependency
-----------------------
1 xxxx 0
2 yyyy 1
3 zzzz 2
4 aaaaaa 0
5 bbbbbb 4
6 cccccc 5
the list goes on. I want to select group of rows from this table , by giving the name of 0 dependency in where clause of SQL and till it reaches a condition where there is no more dependency. (For ex. rows 1,2, 3 forms a group, and rows 4,5,6 is another group) .please help
Since you did not specify a product, I'll go with features available in the SQL specification. In this case, I'm using a common-table expression which are supported by many database products including SQL Server 2005+ and Oracle (but not MySQL):
With MyDependents As
(
Select id, name, 0 As level
From MyTable
Where dependency = 0
And name = 'some value'
Union All
Select T.id, T.name, T.Level + 1
From MyDependents As D
Join MyTable As T
On T.id = D.dependency
)
Select id, name, level
From MyDependents
Another solution which does not rely on common-table expressions but does assume a maximum level of depth (in this case two levels below level 0) would something like
Select T1.id, T1.name, 0 As level
From MyTable As T1
Where T1.name = 'some value'
Union All
Select T2.id, T2.name, 1
From MyTable As T1
Join MyTable As T2
On T2.Id = T1.Dependency
Where T1.name = 'some value'
Union All
Select T3.id, T3.name, 2
From MyTable As T1
Join MyTable As T2
On T2.Id = T1.Dependency
Join MyTable As T3
On T3.Id = T2.Dependency
Where T1.name = 'some value'
Sounds like you want to recursively query your table, for which you will need a Common Table Expression (CTE)
This MSDN article explains CTEs very well. They are confusing at first but surprisingly easy to implement.
BTW this is obviously only for SQL Server, I'm not sure how you'd achieve that in MySQL.
This is the first thing that came to mind. It can be probably done more directly/succinctly, I'll try to dwell on it a little.
SELECT *
FROM table T1
WHERE T1.id >=
(SELECT T2.id FROM table T2 WHERE T2.name = '---NAME HERE---')
AND T1.id <
(SELECT MIN(id)
FROM table T3
WHERE T3.dependency = 0 AND T3.id > T2.id)
If you can estimate a max depth, this works out to something like:
SELECT
COALESCE(t4.field1, t3.field1, t2.field1, t1.field1, t.field1),
COALESCE(t4.field2, t3.field2, t2.field2, t1.field2, t.field2),
COALESCE(t4.field3, t3.field3, t2.field3, t1.field3, t.field3),
....
FROM table AS t
LEFT JOIN table AS t1 ON t.dependency = t1.id
LEFT JOIN table AS t2 ON t1.dependency = t2.id
LEFT JOIN table AS t3 ON t2.dependency = t3.id
LEFT JOIN table AS t4 ON t3.dependency = t4.id
....
This is a wild guess just to be different, but I think it's kind of pretty, anyway. And it's at least as portable as any of the others. But I don't want to look to closely; I'd want to use sensible data, start testing, and check for sensible results.
Hierarchical query will do:
SELECT *
FROM your_table
START WITH id = :id_of_group_header_row
CONNECT BY dependency = PRIOR id
Query works like this:
1. select all rows satisfying START WITH condition (this rows are roots now)
2. select all rows satisfying CONNECT BY condition,
keyword PRIOR means this column's value will be taken from the root row
3. consider rows selected on step 2 to be roots
4. go to step 2 until there are no more rows
Related
I'm quite new to SQL so bare with me. What I'm trying to do is return the value closest to another value in a different table for every record.
I'll show a simplified example of my two tables for clarification
First table is the one that I want the value ENTRY_YEAR matched to:
ID
ENTRY_VALUE
1001
1900
1002
2000
And the second table:
ID
ENTRY_VALUE
STATUS
1001
1880
SUCCES
1001
1930
FAIL
1001
1940
SUCCES
1002
1960
SUCCES
1002
1980
FAIL
So the end result I'm looking for is:
ID
ENTRY_VALUE
STATUS
1001
1880
SUCCES
1002
1980
FAIL
I have currently only managed to link the id's together but can't find a way to compare the ENTRY_VALUE in both tables and return the one closest to the Table1 entry.
So only this:
SELECT * from Table2
INNER JOIN Table1 ON (Table2.ID = Table1.ID)
Once again my bad for the basic question, I have googled right about everything but can't get it to work so any help is very welcome!
First attempt
This is a (slower performing) query. First attempt! This is an approach using a "correlated subquery" so it runs the inner query for each row of the outer query. The strategy is, for each row, to determine what the min value is we are looking for, and then select only the rows that fit that criteria. But such queries can be slow at runtime, although the logic is very clean.
select
a.id,
b.entry_value,
b.[status]
from
Foo a
inner join Bar b
on a.id = b.id
where
abs(a.entry_value - b.entry_value) =
(select min(abs(t1.entry_value-t2.entry_value))
from Foo t1
inner join Bar t2
on t1.id = t2.id
where
t1.id = a.id
group by t1.id)
Second attempt
If you have many rows (in the tens of thousands or in any case if the previous query is just too slow), then this next one should be better performing. Second Attempt! If you run the two inner queries by themselves, you will probably see the strategy here of how we are joining them to get the desired result.
select A.Id, A.entry_value, A.[status]
from
(
select t1.id, t2.entry_value, abs(t1.entry_value-t2.entry_value) as diff, t2.[status]
from Foo t1
inner join Bar t2
on t1.id = t2.id
) A
inner join
(
select t3.id, min(abs(t3.entry_value-t4.entry_value)) as diff
from Foo t3
inner join Bar t4
on t3.id = t4.id
group by t3.id
) B
on A.id = B.id
and A.diff = B.diff
Note
I would probably not try to write either of these queries in MSAccess "Design view" although if I had too I am sure I could. But generally, this is a case where I would write the query "by hand" and paste it into your query directly using MSAccess "SQL view".
Caution
Beware that ties will result in two rows! Example:
First table has (1003,2000)
Second table has (1003, 1990, 'success') and (1003, 2010, 'fail')
You will have a result with two rows, one with success and the other with fail (!)
So you really should test with your data and look for such cases that might produce such ties (and decide what to do, if necessary).
Btw...
just for fun, here's how you might go for it in SQL Server.
But I think this will NOT work in MSAccess, unfortunately.
select
T.id,
T.entry_value,
T.[status]
from
(
select
t1.id,
t2.entry_value,
abs(t1.entry_value-t2.entry_value) as diff,
t2.[status],
rank() over (partition by t1.id order by abs(t1.entry_value-t2.entry_value)) as seq
from #Foo t1
inner join #Bar t2
on t1.id = t2.id
) T
where T.seq = 1;
Use a simple subquery to find the minimum offset:
Select
tbl1.ID,
tbl2.ENTRY_VALUE,
tbl2.STATUS
From
tbl1
Inner Join
tbl2 On tbl1.ID = tbl2.ID
Where
Abs([tbl1].[ENTRY_VALUE] - [tbl2].[ENTRY_VALUE]) =
(Select Min(Abs([tbl1].[ENTRY_VALUE] - [T2].[ENTRY_VALUE])) As Offset
From tbl2 As T2
Where T2.ID = tbl1.ID);
Output:
ID
ENTRY_VALUE
STATUS
1001
1880
SUCCES
1002
1980
FAIL
Note, that if the minimum offset for an ID exists twice, both records having this offset will be returned. Thus, you may have to aggregate the output.
In MSAccess I have the query below. When I try to run the query gives the error Your query does not include the specified expression "ID" as part of aggregate function and I can't find the reason. What is the problem in my query?
SELECT
Count(t2.subjectid) AS CountOfsubjectid,
t2.pname,
(
select
max(outcometime)
from
table1 t1
where
t1.id = t2.id
)
AS showntime
FROM
table2 AS t2
WHERE
t2.outcome = "accepted"
GROUP BY
t2.pname,
t2.showntime;
UPDATE (SAMPLE DATA):
Table1:
ID outcometime pname outcome subjectid
1 20181111 USB shown Ux1ku
1 20181113 USB shown Ux1ku
2 20181115 USB shown Tsn2f
3 20181116 USB shown O93nf
2 20181114 USB shown Tsn2f
2 20181112 USB shown Tsn2f
Table2:
ID outcometime pname outcome subjectid
1 20181118 USB accepted Ux1ku
2 20181119 USB accepted Tsn2f
3 20181117 USB accepted O93nf
Desired Result:
pname showntime countofsubjectid
USB 20181113 1
USB 20181115 1
USB 20181116 1
Also updated the sample data. It was wrong.
Thanks.
Currently, you are attempting to run a correlated subquery in the SELECT clause encapsulated in an aggregate query and then reference this very subquery by alias in the GROUP BY clause.
Consider using a derived table to first run your unit level with subquery and then in outer main query run your aggregation.
SELECT
dt.pname,
COUNT(subjectid) AS CountOfsubjectid,
dt.showntime
FROM
(SELECT
t2.subjectid
t2.pname,
(
select
max(outcometime)
from
table1 t1
where
t1.id = t2.id
)
AS showntime
FROM
table2 AS t2
WHERE
t2.outcome = 'accepted'
) AS dt
GROUP BY
dt.pname,
dt.showntime;
However, consider avoiding the inefficient correlated subquery to run for every row in table to joining on an aggregate query for the MAX calculated once and then run aggregation again for COUNT on main level.
SELECT
t2.pname,
COUNT(t.subjectid) AS CountOfsubjectid,
agg.showntime
FROM
table2 AS t2
INNER JOIN
(
select
t1.id,
max(outcometime) as showntime
from
table1 t1
group by
t1.id
) AS agg
ON t2.id = agg.id
WHERE
t2.outcome = 'accepted'
GROUP BY
t2.pname,
agg.showntime;
If I understand correctly, you need to write this as:
SELECT Count(t2.subjectid) AS CountOfsubjectid,
t2.name,
(select max(outcometime)
from table1 as t1 inner join
table2 as tt2
on t1.id = tt2.id
where tt2.name = t2.name
) as showntime
FROM table2 AS t2
WHERE t2.outcome = "accepted"
GROUP BY t2.name;
Assume there is a table name "test" below:
name value
n1 1
n2 2
n3 3
Now, I want to get the name which has the max value, I have some solution below:
Solution 1:
SELECT TOP 1 name
FROM test
ORDER BY value DESC
solution 2:
SELECT name
FROM test
WHERE value = (SELECT MAX(value) FROM test);
Now, I hope use join operation to find the result, like
SELECT name
FROM test
INNER JOIN test ON...
Could someone please help and explain how it works?
If you are looking for JOIN then
SELECT T.name, T.value
FROM test T
INNER JOIN
( SELECT T1.name, T1.value ,
RANK() OVER (PARTITION BY T1.name ORDER BY T1.value) N
FROM test T1
WHERE T1.value IN (SELECT MAX(t2.value) FROM test T2)
)T3 ON T3.N = 1 AND T.name = T3.name
FIDDLE DEMO
or
select name, value
from
(
select name, value,
row_number() over(order by value desc) rn
from test
) src
where rn = 1
FIDDLE DEMO
First, note that solutions 1 and 2 could give different results when value is not unique. If in your test data there would be an additional record ('n4', 3), then solution 1 would return either 'n3' or 'n4', but solution 2 would return both.
A solution with JOIN will need aliases for the table, because as you started of, the engine would say Ambiguous column name 'name'.: it would not know whether to take name from the first or second occurrence of the test table.
Here is a way to complete the JOIN version:
SELECT t1.name
FROM test t1
LEFT JOIN test t2
ON t2.value > t1.value
WHERE t2.value IS NULL;
This query takes each of the records, and checks if any records exist that have a higher value. If not, the first record will be in the result. Note the use of LEFT: this denotes an outer join, so that records from t1 that have no match with t2 -- based on the ON condition -- are not immediately rejected (as would be the case with INNER): in fact, we want to reject all the other records, which is done with the WHERE clause.
A way to understand this mechanism, is to look at a variant of the query above, which lacks the WHERE clause and returns the values of both tables:
SELECT t1.value, t2.value
FROM test t1
LEFT JOIN test t2
ON t2.value > t1.value
On your test data this will return:
t1.value t2.value
1 2
1 3
2 3
3 (null)
Note that the last entry would not be there if the join where an INNER JOIN. But with the outer join, one can now look for the NULL values and actually get those records in the result that would be excluded from an INNER JOIN.
Note that this query will give the same result as solution 2 when there are duplicate values. If you want to have also only one result like with solution 1, it suffices to add TOP 1 after SELECT.
Here is a fiddle.
Alternative with pure INNER JOIN
If you really want an INNER join, then this will do it. Again the TOP 1 is only needed if you have non-unique values:
SELECT TOP 1 t1.name
FROM test t1
INNER JOIN (SELECT Max(value) AS value FROM test) t2
ON t2.value = t1.value;
But this one really is very similar to what you did in solution 2. Here is fiddle for it.
Here's my data:
Table1...
id
field1
field2
...
Table2...
id
table1_original_id
table1_new_id
Table1 holds records that cannot themselves be updated though I built a mechanism for my users to be able to "update" them... they select a record, make changes, and those changes are actually submitted as a new record. For conversation's sake let's assume there are currently 10 records in Table1 with IDs 1 through 10. User submits an update to ID 3. ID 3 remains as it was and a new record, ID 11, is added to Table1. Along with this new record, Table2 gets a new record, with ID = 1, t1_original_id = 3 and t1_new_id = 11.
What would be by SQL select to retrieve the "pairs" of records from Table1... in this case the query would provide me with... IDs 3 and 11.
scratching head
I don't think it matters much by DB platform is Postgres 8.4 and I'm retrieving the data via PHP to be processed in jqGrid. Bonus points if you can point me in the direction of displaying each pair of records in a separate jqGrid!
Thanks for your time and effort.
=== EDIT ===
The following is a sample of what I need returned from the query based on the scenario above:
id field1 field2
3 blah stuff
11 more things
Once I have these pairs back I can process them further, as necessary, on the PHP side.
Standard SQL solution
To get two separate rows (as later specified in the Q update) use a UNION query:
SELECT 'old' AS version, t1.id, t1.field1, t1.field2
FROM table1 t1
JOIN table2 t2 ON t2.table1_original_id = t1.id
WHERE t2.id = $some_t2_id -- your t2.id here!
UNION ALL
SELECT 'new' as version, t1.id, t1.field1, t1.field2
FROM table1 t1
JOIN table2 t2 ON t2.table1_new_id = t1.id
WHERE t2.id = $some_t2_id; -- your t2.id here!
Note: this is one query.
Returns as requested, plus a column version to distinguish the rows reliably:
version | id | field1 | field2
--------+----+--------+--------
old | 3 | blah | stuff
new | 11 | more | things
Faster Postgres-specific solution
A more sophisticated, but shorter and faster solution:
SELECT version, id, field1, field2
FROM (
SELECT unnest(ARRAY['old', 'new']) AS version
,unnest(ARRAY[table1_original_id, table1_new_id]) AS id
FROM table2 t2 WHERE t2.id = $some_t2_id -- your t2.id here!
) t2
JOIN table1 USING (id);
Same result.
-> SQLfiddle demo for both.
You'll need to JOIN to Table1 twice, something like:
SELECT orig.id AS Orig_ID
, orig.value AS Orig_Value
, n.id AS New_ID
, n.value AS New_Value
FROM Table2 a
JOIN Table1 AS orig
ON a.table1_original_id = orig.id
JOIN Table1 AS n
ON a.table1_new_id = n.id
Demo: SQL Fiddle
Update:
To get them paired as you want without manually choosing a set you'll need something like this:
SELECT sub.id,sub.value
FROM (SELECT a.id as Update_ID,o.id,o.value, '1' AS sort
FROM Table2 a
JOIN Table1 AS o
ON a.table1_original_id = o.id
UNION
SELECT a.id as Update_ID, n.id,n.value, '2' AS sort
FROM Table2 a
JOIN Table1 AS n
ON a.table1_new_id = n.id
) AS sub
ORDER BY Update_ID,sort
Demo: Sql Fiddle
Notes: Change UNION to UNION + ALL, can't post those words next to each other due to firewall limitation.
The hard-coded '1' and '2' are so that original always appear before new.
I want an SQL code which should perform the task of data scrubbing.
I have two tables both contain some names I want to compare them and list out only those name which are in table 2 but not in table 1.
Example:
Table 1 = A ,B,C
Table 2 = C,D,E
The result must have D and E?
SELECT t2.name
FROM 2 t2
LEFT JOIN
1 t1 ON t1.name=t2.name
WHERE t1.name IS NULL
select T2.Name
from Table2 as T2
where not exists (select * from Table1 as T1 where T1.Name = T2.Name)
See this article about performance of different implementations of anti-join (for SQL Server).
select t2.name
from t2,t1
where t2.name<>t1.name -- ( or t2.name!=t1.name)
If the DBMS supports it:
select name from table2
minus
select name from table1
A more portable solution could also be:
select name from table2
where name not in (select name from table1)