SQL - Querying top value for multiple columns - sql

Table1:
Name, Value A, Value B, Value C
I would like to find the Name with the largest Value A, the Name with the largest Value B, and the Name with the largest Value C. Does anyone have a quick way to do this? The table itself is rather large, and I would really like to avoid scanning it once per value.
Thank you!

If you have an index on each of the columns (a), (b), and (c), you can do:
select t.*
from t
where t.a = (select max(t2.a) from t t2) or
t.b = (select max(t2.b) from t t2) or
t.c = (select max(t2.c) from t t2) ;
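The where clause should be able to make use of the indexes. If not, you can split this into separate queries combined with union all. Note that a row holding the maximum in more than one column is then returned once per column. Something like this: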
select t.*
from t
where t.a = (select max(t2.a) from t t2)
union all
select t.*
from t
where t.b = (select max(t2.b) from t t2)
union all
select t.*
from t
where t.c = (select max(t2.c) from t t2) ;
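As a rough sketch of the union all variant, the snippet below runs it against an in-memory SQLite database through Python's sqlite3 module. The table layout, the sample rows, the index names and the extra `which` label column are all made up for illustration, and SQLite only stands in for whatever database is actually in use, so the query plan on a real system may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, a INTEGER, b INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [("w", 1, 9, 3), ("x", 7, 2, 4), ("y", 5, 5, 8), ("z", 2, 1, 1)],
)

# One index per column, as the answer assumes (index names are hypothetical).
for col in ("a", "b", "c"):
    conn.execute(f"CREATE INDEX idx_t_{col} ON t ({col})")

# Each branch looks up the row(s) holding the maximum of one column.
query = """
SELECT 'a' AS which, t.name, t.a AS value
FROM t WHERE t.a = (SELECT MAX(t2.a) FROM t t2)
UNION ALL
SELECT 'b', t.name, t.b
FROM t WHERE t.b = (SELECT MAX(t2.b) FROM t t2)
UNION ALL
SELECT 'c', t.name, t.c
FROM t WHERE t.c = (SELECT MAX(t2.c) FROM t t2)
"""

# Sort in Python so the output order does not depend on the engine.
top_rows = sorted(conn.execute(query).fetchall())
print(top_rows)  # [('a', 'x', 7), ('b', 'w', 9), ('c', 'y', 8)]
```

The label column is only there to show which maximum each row belongs to; drop it if you just need the rows.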

Related

sql query to add a column to select view

How can I write a select query to view a few columns from a table and add additional columns to it with a default value assigned?
Something like Select a, b, c, d="TIM" from table1;
where a,b and c are columns in table1, but "d" isn't.
Like this
select a, b, c, 'TIM' as d
from your_table
You can just select a constant value:
Select t1.a, t1.b, t1.c, 'TIM' as d
from table1 t1;
Note that SQL in general -- and Oracle in particular -- uses single quotes to delimit strings.
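A minimal sketch of the constant-column approach, run against an in-memory SQLite database via Python's sqlite3 module. The sample rows are invented for illustration, and SQLite is only a stand-in for the asker's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (a INTEGER, b INTEGER, c INTEGER)")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", [(1, 2, 3), (4, 5, 6)])

# 'TIM' is a string literal in single quotes; AS d names the extra column.
cur = conn.execute(
    "SELECT t1.a, t1.b, t1.c, 'TIM' AS d FROM table1 t1 ORDER BY t1.a"
)
columns = [desc[0] for desc in cur.description]
rows = cur.fetchall()

print(columns[-1])  # d
print(rows)         # [(1, 2, 3, 'TIM'), (4, 5, 6, 'TIM')]
```

Every row gets the same value in d, and table1 itself is not modified.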
Alternatively, if d actually lives in another table (table2) that shares an id with table1, and you want the rows where d = 'TIM', join the two tables:
SELECT t1.a,t1.b,t1.c,t2.d
FROM table1 t1
JOIN table2 t2 ON t1.id = t2.id
WHERE t2.d = 'TIM';

Why do parentheses work in SELECT queries but not in subqueries?

Why is this a valid query
SELECT T.A FROM
(SELECT A, B
FROM test) T ;
and this:
(SELECT DISTINCT(A,B)
FROM test);
but not this:
SELECT T.A FROM
(SELECT DISTINCT(A, B)
FROM test) T ;
(specifically in postgresql, but I suspect other sql too)?
UPDATE:
Postgres fails with:
ERROR: column t.a does not exist
LINE 1: SELECT T.A FROM
Changing the query to
SELECT T.A FROM
(SELECT DISTINCT A, B
FROM test) T ;
succeeds, where
SELECT T.A FROM
(SELECT DISTINCT (A, B)
FROM test) T ;
fails. Why?
When you do this:
SELECT T.A FROM
(SELECT DISTINCT(A, B)
FROM test) T ;
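DISTINCT is a keyword, not a function, so the parentheses belong to the expression that follows it. (A, B) is a row constructor, which makes the subquery return a single column of an anonymous record type. As such, your subquery (T) sees rows of an anonymous record coming back, not the individual fields A and B, and that is why T.A does not exist.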
Without the parentheses, they are treated like normal fields, which appears to be what you want.
You have to assign an alias inside the subquery; otherwise the outer query cannot find a column named A. Note that with this alias, T.A is the whole (A, B) record, not column A alone:
SELECT T.A FROM
(SELECT DISTINCT(A, B) AS A
FROM test) T ;

Reducing the list of results (SQL)

I have been stuck on an SQL statement for two days now and I hope you can help me with it.
The result of my select is a list with 4 attributes A, B, C and D (below is an example list of 5 datasets):
1. A=1 B=100 C=200 D=300
2. A=2 B=200 C=100 D=300
3. A=3 B=300 C=200 D=100
4. A=3 B=100 C=100 D=200
5. A=3 B=300 C=100 D=200
The list should be reduced so that each value of A appears only once.
In the example above, datasets 1. and 2. should be in the list, because A=1 and A=2 each occur only once.
For A=3 I have to build a query that identifies the dataset to keep in the final list. The following rules should apply:
Take the dataset with the highest value of B; if not distinct then
Take the dataset with the highest value of C; if not distinct then
Take the dataset with the highest value of D.
In the example above the dataset 3. should be taken.
The expected result is:
1. A=1 B=100 C=200 D=300
2. A=2 B=200 C=100 D=300
3. A=3 B=300 C=200 D=100
I hope you understand my problem. I've tried various versions of SELECT-statements with HAVING and EXISTS (or NOT EXISTS), but my SQL knowledge isn't enough.
Probably there is an easier way to solve this problem, but this one works:
CREATE TEMP TABLE TEST (
A INTEGER,
B INTEGER,
C INTEGER,
D INTEGER
);
insert into TEST values (1,1,1,1);
insert into TEST values (2,1,5,1);
insert into TEST values (2,2,1,1);
insert into TEST values (3,1,4,1);
insert into TEST values (3,2,1,4);
insert into TEST values (3,2,3,1);
insert into TEST values (3,3,1,5);
insert into TEST values (3,3,2,3);
insert into TEST values (3,3,2,7);
insert into TEST values (3,3,3,1);
insert into TEST values (3,3,3,2);
select distinct
t1.A,
t2.B as B,
t3.C as C,
t4.D as D
from TEST t1
join (select A ,MAX (B) as B from TEST group by A)t2 on t2.A=t1.A
join (select A, B, MAX(C) as C from TEST group by A,B)t3 on t3.A=t2.A and t3.B=t2.B
join (select A, B, C, MAX (D) as D from TEST group by A,B,C)t4 on t4.A=t3.A and t4.B=t3.B and t4.C=t3.C;
Result:
a b c d
1 1 1 1
2 2 1 1
3 3 3 2
Tested on IBM Informix Dynamic Server Version 11.10.FC3.
This type of prioritization query is most easily done with row_number(), but I don't think Informix supports that.
So, one method is to enumerate the rows using a correlated subquery that counts, within the same value of A, how many rows rank higher:
select t.*
from (select t.*,
(select count(*)
from t t2
where t2.a = t.a and
((t2.b > t.b) or
(t2.b = t.b and t2.c > t.c) or
(t2.b = t.b and t2.c = t.c and t2.d > t.d))
) as NumGreater
from t
) t
where NumGreater = 0;
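Here is a small sketch of the correlated-subquery approach using the five rows from the question, run through Python's sqlite3 module. It assumes the comparison is restricted to rows with the same A (the `t2.a = t.a` condition), and SQLite is only a stand-in here, so Informix may need small syntax adjustments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER, d INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 100, 200, 300),
        (2, 200, 100, 300),
        (3, 300, 200, 100),
        (3, 100, 100, 200),
        (3, 300, 100, 200),
    ],
)

# For every row, count the rows with the same A that rank higher by
# (B, C, D). A row with a count of 0 is the winner for its A.
query = """
SELECT r.a, r.b, r.c, r.d
FROM (SELECT t.*,
             (SELECT COUNT(*)
              FROM t t2
              WHERE t2.a = t.a
                AND (t2.b > t.b
                     OR (t2.b = t.b AND t2.c > t.c)
                     OR (t2.b = t.b AND t2.c = t.c AND t2.d > t.d))
             ) AS num_greater
      FROM t) r
WHERE r.num_greater = 0
ORDER BY r.a
"""
winners = conn.execute(query).fetchall()
print(winners)
# [(1, 100, 200, 300), (2, 200, 100, 300), (3, 300, 200, 100)]
```

If two rows tie on B, C and D for the same A, both have a count of 0 and both are returned, so add a further tie-breaker if exactly one row per A is required.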
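I have no experience with Informix, but you can try the following. The same idea works in SQL Server (with TOP 1 in place of FIRST 1), so it may also work in Informix. It assumes the table has a unique id column: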
select * from tablename t1
where id = (select first 1 id from tablename t2
where t2.A = t1.A order by B desc, C desc, D desc)
SELECT A, MAX(B) AS B, MAX(C) AS C, MAX(D) AS D
FROM table_name
GROUP BY A

Modify: Query which uses Group By/Having clauses, to another query which uses just Select/From/Where

Can a query which uses Group By/Having clauses, be modified to another query which uses just Select/From/Where clauses?
TABLE T(a, b, c)
SELECT a, sum(c)
FROM T
WHERE b>10
GROUP BY a
HAVING sum(c)>5
Would appreciate it if you could explain in detail why it can(not) be done.
You could, of course, resort to using window functions only, if your specific database supports those:
SELECT a, s
FROM (
SELECT DISTINCT a, sum(c) OVER (PARTITION BY a) s
FROM t
WHERE b > 10
) t2
WHERE s > 5
Another option is to use correlated subqueries, which work on all databases:
SELECT a, s
FROM (
SELECT DISTINCT a, (SELECT sum(c) FROM t t3 WHERE t1.a = t3.a AND b > 10) s
FROM t t1
WHERE b > 10
) t2
WHERE s > 5
These alternatives would yield the same result without using GROUP BY or HAVING. But either of these would be (much) slower, and I don't really see the point...
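To check the equivalence on a small case, the sketch below runs the original GROUP BY/HAVING query and the correlated-subquery rewrite side by side in an in-memory SQLite database via Python's sqlite3 module. The sample rows are invented for illustration, and the column references in the rewrite are fully qualified to make the scoping explicit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 11, 3), (1, 12, 4), (1, 5, 100), (2, 20, 2), (2, 30, 1), (3, 15, 6)],
)

# Original form: filter rows, group, then filter groups.
grouped = conn.execute("""
    SELECT a, SUM(c)
    FROM t
    WHERE b > 10
    GROUP BY a
    HAVING SUM(c) > 5
    ORDER BY a
""").fetchall()

# Rewrite without GROUP BY/HAVING: DISTINCT collapses the duplicates and
# the correlated subquery computes the per-a sum over the same filtered rows.
rewritten = conn.execute("""
    SELECT a, s
    FROM (
        SELECT DISTINCT t1.a AS a,
               (SELECT SUM(t3.c) FROM t t3
                WHERE t3.a = t1.a AND t3.b > 10) AS s
        FROM t t1
        WHERE t1.b > 10
    ) t2
    WHERE s > 5
    ORDER BY a
""").fetchall()

print(grouped)    # [(1, 7), (3, 6)]
print(rewritten)  # [(1, 7), (3, 6)]
```

The row (1, 5, 100) is excluded by b > 10 in both forms, which is why the sum for a = 1 is 7 and not 107.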

Combinations of unique values of a column in a table

I need to find all possible unique combinations of the values in a column of a table. For example, for the column values 1, 2, 3, 4, 5, I want the result to be [1,2], [1,3], [1,4], [1,5], [2,1], [2,3], etc.
I would appreciate any pointers on constructing a query that finds these combinations.
Thanks
You can do a cross join in BigQuery by using a subselect that adds a constant key value, then joining on that constant value.
For example, here is a query that will compute the cross join of {1, 2, 3} and {2, 4, 6}:
SELECT t1.num as first, t2.num as second
FROM (
SELECT num, 1 as key
FROM (
SELECT 1 as num), (
SELECT 2 as num), (
SELECT 3 as num)) as t1
JOIN (
SELECT num, 1 as key
FROM (
SELECT 2 as num), (
SELECT 4 as num), (
SELECT 6 as num)) as t2
ON t1.key = t2.key
WHERE t1.num <> t2.num
Note this uses a BigQuery "trick" to create the two input tables. If you were just doing this with an existing table, it would look like:
SELECT t1.num as first, t2.num as second
FROM (
SELECT foo as num, 1 as key
FROM [my_dataset.my_table]) as t1
JOIN (
SELECT foo as num, 1 as key
FROM [my_dataset.my_table]) as t2
ON t1.key = t2.key
WHERE t1.num <> t2.num
A cross join might be useful.
See this demo: http://www.sqlfiddle.com/#!12/59af5/1
The ANSI SQL syntax uses a CROSS JOIN operator:
create table val( x int );
insert into val values(1),(2),(3),(4),(5);
SELECT a.x a, b.x b
FROM val a
CROSS JOIN val b
WHERE a.x <> b.x
ORDER BY a,b;
Another form of this query, without CROSS JOIN, should work on most DBMSs, but the ANSI form is recommended for clarity:
SELECT a.x a, b.x b
FROM val a, val b
WHERE a.x <> b.x
ORDER BY a,b;
Beware that a cross join on large datasets can kill your database performance: 100 values generate 100 x 100 = 10,000 rows before filtering, and 1,000 values generate 1,000,000 rows.
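A quick sketch of the CROSS JOIN query on the values 1 to 5, run in an in-memory SQLite database through Python's sqlite3 module (SQLite is only a stand-in for the actual database). It also shows how fast the row count grows: n values yield n * (n - 1) ordered pairs once the a.x <> b.x filter is applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE val (x INTEGER)")
conn.executemany("INSERT INTO val VALUES (?)", [(n,) for n in range(1, 6)])

# Pair every value with every other value, excluding pairs like (1, 1).
pairs = conn.execute("""
    SELECT a.x, b.x
    FROM val a
    CROSS JOIN val b
    WHERE a.x <> b.x
    ORDER BY a.x, b.x
""").fetchall()

print(len(pairs))  # 20, i.e. 5 * 4
print(pairs[:5])   # [(1, 2), (1, 3), (1, 4), (1, 5), (2, 1)]
```

If [1,2] and [2,1] should count as the same combination, replacing a.x <> b.x with a.x < b.x halves the result.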