Sum multiple columns using a subquery - sql

I'm playing around with an Oracle database, trying to sum two columns from the same row and output a total on the fly.
However, I can't seem to get it to work. Here's the code I have so far:
SELECT a.name , SUM(b.sequence + b.length) as total
FROM (
SELECT a.name, a.sequence, b.length
FROM tbl1 a, tbl2 b
WHERE b.sequence = a.sequence
AND a.loc <> -1
AND a.id='10201'
ORDER BY a.location
)
The inner query works, but I can't seem to make the new query and the subquery work together.
Here's a sample table I'm using:
...[name][sequence][length]...
...['aa']['100000']['2000']...
...
...['za']['200000']['3001']...
And here's the output I'd like:
[name][ total ]
['aa']['102000']
...
['za']['203001']
Help much appreciated, thanks!

SUM() sums numbers across rows. Instead, replace it with sequence + length.
...or, if there is the possibility of NULL values occurring in either the sequence or length columns, use COALESCE(sequence, 0) + COALESCE(length, 0) (a variant using this appears after the query below).
Or, if your intention was indeed to produce a running total (i.e. aggregating the sum of all the sequences and lengths for each name), add a GROUP BY a.name after the end of the subquery.
BTW: you shouldn't be referencing the internal aliases used inside a subquery from outside of that subquery. Some DB servers allow it (and I don't have convenient access to an Oracle server right now, so I can't test it), but it's not really good practice.
I think what you are after is something like:
SELECT A.name,
       SUM(B.sequence + B.length) AS total
FROM Tbl1 A
INNER JOIN Tbl2 B
    ON B.sequence = A.sequence
WHERE A.loc <> -1
  AND A.id = 10201
GROUP BY A.name
ORDER BY MIN(A.location)
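If NULLs can occur in sequence or length, here is the same query with the COALESCE suggestion from above folded in (a sketch; column names as in the question):
SELECT A.name,
       SUM(COALESCE(B.sequence, 0) + COALESCE(B.length, 0)) AS total
FROM Tbl1 A
INNER JOIN Tbl2 B
    ON B.sequence = A.sequence
WHERE A.loc <> -1
  AND A.id = 10201
GROUP BY A.name
ORDER BY MIN(A.location)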

Your query with the subquery fails for several reasons:
You reference the table alias a in the outer query, but it is not defined at that level.
You reference the table alias b in the outer query, but it is not defined at that level.
You have a sum() in the select clause with unaggregated columns, but no group by.
In addition, you have an order by in the subquery which is allowed syntactically, but ignored.
Here is a better way to write the query without a subquery:
SELECT t1.name, (t1.sequence + t2.length) as total
FROM tbl1 t1 join
tbl2 t2
on t1.sequence = t2.sequence
where t1.loc <> -1 AND t1.id = '10201'
ORDER BY t1.location;
Note the use of proper join syntax, the use of aliases that make sense, and the simple calculation at this level.
Here is a version with a subquery:
select name, (sequence + length) as total
from (SELECT t1.name, t1.sequence, t2.length
FROM tbl1 t1 join
tbl2 t2
on t1.sequence = t2.sequence
where t1.loc <> -1 AND t1.id = '10201'
) t
ORDER BY location;
Note that the order by goes at the outer level. And I gave the subquery an alias (t); this is not strictly required, but it is typically a good idea.

Related

Snowflake, SQL where clause

I need to write a query with a where clause:
where
pl.ods_site_id in (select id from table1 where ...)
But if the subquery (on table1) returns no rows, the where clause should not filter anything at all (i.e. it should behave as if it evaluated to TRUE).
How can I do this in the Snowflake SQL dialect?
You could include a second condition:
where pl.ods_site_id in (select id from table1 where ...) or
not exists (select id from table1 where ...)
This explicitly checks for the subquery returning no rows.
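To avoid writing (and evaluating) the subquery twice, here is a sketch of the same idea using a CTE; the CTE name filtered_ids is made up, the elided filter from the question is kept as-is, and pl is assumed to be the table being filtered:
with filtered_ids as (
    select id from table1 where ...
)
select pl.*
from pl
where pl.ods_site_id in (select id from filtered_ids)
   or not exists (select 1 from filtered_ids);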
If you are willing to use a join instead, Snowflake supports the QUALIFY clause, which might come in handy here. You can run this on Snowflake to see how it works.
with
pl (ods_site_id) as (select 1 union all select 5),
table1 (id) as (select 5) --change this to 7 to test if it returns ALL on no match
select a.*
from pl a
left join table1 b on a.ods_site_id = b.id -- and other conditions you want to add
qualify b.id = a.ods_site_id --either match the join condition
or count(b.id) over () = 0; --or make sure there is 0 match from table1

Oracle SQL query optimization - getting counts based on a varchar field

Optimizing a query
I have a query that gets data from one table and gets two counts from two other tables based on a varchar field TYPE. I need the count from TABLE2 where TYPE = TABLE1.TYPE and the count from TABLE3 where TYPE = TABLE1.TYPE.
At this point I cannot create any indexes on those fields, so I decided to use functions, which brought my original query's execution time down to 5 seconds. That is still too much. Any suggestions on how to further optimize my query?
SELECT a.ID,
a.FIELD1,
a.FIELD2,
a.TYPE,
GET_COUNT_1(a.TYPE) as COUNT1,
GET_COUNT_2(a.TYPE) as COUNT2
FROM TABLE1 a
my original query was:
SELECT a.ID,
a.FIELD1,
a.FIELD2,
a.TYPE,
(SELECT COUNT(*) FROM TABLE2 b WHERE b.TYPE=a.TYPE) as COUNT1,
(SELECT COUNT(*) FROM TABLE3 c WHERE c.TYPE=a.TYPE) as COUNT2
FROM TABLE1 a
If you do not have an index on table2(TYPE), it is deadly to use a correlated subquery, as you will repeatedly (for each row of TABLE1) perform a FULL TABLE SCAN.
Apparently the Oracle subquery caching that could save you did not kick in.
The function approach will not be much better, unless you implement some function result caching of your own (a sketch of that appears after the query below).
But there is a simple solution: precalculate the counts in a subquery and join the result to TABLE1.
Note that this calculates the count only once for each type, not once for each row of TABLE1:
with cnt as
(select type, count(*) cnt
from table2 group by type),
cnt2 as
(select type, count(*) cnt
from table3 group by type)
select a.ID,
a.FIELD1,
a.FIELD2,
a.TYPE,
b.cnt cnt1,
c.cnt cnt2
from TABLE1 a
left outer join cnt b
on a.type = b.type
left outer join cnt2 c
on a.type = c.type
You will end up with one full table scan of each table, plus the aggregation and the outer joins, which is the minimum work you need to do.
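As an aside, if you do stay with the function approach, Oracle's RESULT_CACHE gives you the kind of function result caching mentioned above. A minimal sketch for GET_COUNT_1, assuming TYPE is a VARCHAR2:
create or replace function get_count_1 (p_type in varchar2)
    return number
    result_cache
is
    v_cnt number;
begin
    -- Oracle caches the result per distinct p_type value across calls
    select count(*) into v_cnt from table2 where type = p_type;
    return v_cnt;
end;
/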
For your query, you want an index on table2(type), and likewise on table3(type).
The two subqueries are exactly the same apart from the table and alias. Since you really do have two different tables (or if you are using different columns), you'll want the appropriate index for each expression.
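For reference, those indexes would look like this (index names are made up; the question notes that indexes cannot be created yet):
create index table2_type_idx on table2 (type);
create index table3_type_idx on table3 (type);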

sql - ignore duplicates while joining

I have two tables.
Table1 has 1591 rows. Table2 has 270 rows.
I want to fetch a specific column from Table2 based on a condition between the two tables, while excluding the duplicates that exist in Table2. In other words, I want to join the tables but get only one value from Table2 even if the join condition matches more than once. The result should be exactly 1591 rows.
I tried left, right, and inner joins, but the result comes back with more or fewer than 1591 rows.
Example
Table1
type,address,name
40,blabla,Adam
20,blablabla,Joe
Table2
type,currency
40,usd
40,gbp
40,omr
Joining on 'type'
Result
type,address,name,currency
40,blabla,Adam,usd
20,blablabla,Joe,null
Try this; it has to work (the left join keeps the Table1 rows that have no match in Table2):
select *
from Table1 h
left join (
    select type, currency,
           row_number() over (partition by type order by currency) as rn
    from Table2
) sr
    on sr.type = h.type
   and sr.rn = 1
Try this. It's close to standard SQL (the square brackets are SQL Server-style quoting), so it should be easy to adapt to your RDBMS:
select *
from Table1 as t
left outer join Table2 as y
    on t.[type] = y.[type]
   and y.currency = (select max(currency) from Table2 z where z.[type] = y.[type])
If you want to control which currency is joined, consider altering Table2 by adding a new active/non-active column and modifying the JOIN clause accordingly.
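A sketch of that approach (the column name active and the update condition are illustrative only):
alter table Table2 add active int not null default 0;
update Table2 set active = 1 where currency = 'usd';  -- mark the one preferred row per type

select *
from Table1 as t
left outer join Table2 as y
    on t.[type] = y.[type]
   and y.active = 1;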
You can use outer apply if your database supports it (e.g. SQL Server, or Oracle 12c and later).
select a.type, a.address, a.name, b.currency
from Table1 a
outer apply (
select top 1 currency
from Table2
where Table2.type = a.type
) b
A typical way to do this uses a correlated subquery. This guarantees that all rows in the first table are kept. A scalar subquery would generate an error if it returned more than one row from the second table, so the query limits it to a single row.
So:
select t1.*,
(select t2.currency
from table2 t2
where t2.type = t1.type
fetch first 1 row only
) as currency
from table1 t1;
You don't specify what database you are using, so this uses standard syntax for returning one row. Some databases use limit or top instead.
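For example, here is the same correlated subquery written with those variants (pick whichever your database accepts):
-- limit (e.g. MySQL, PostgreSQL)
select t1.*,
       (select t2.currency
        from table2 t2
        where t2.type = t1.type
        limit 1
       ) as currency
from table1 t1;

-- top (SQL Server)
select t1.*,
       (select top 1 t2.currency
        from table2 t2
        where t2.type = t1.type
       ) as currency
from table1 t1;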

Redshift Query returning too many rows in aggregate join

I am sure I must be missing something obvious. I am trying to line up two tables with different measurement data for analysis, and my counts are coming back enormously high when I join the two tables together.
Here are the correct counts from my table1
select line_item_id,sum(is_imp) as imps
from table1
where line_item_id=5993252
group by 1;
Here are the correct counts from table2
select cs_line_item_id,sum(grossImpressions) as cs_imps
from table2
where cs_line_item_id=5993252
group by 1;
When I join the tables together, my counts become inaccurate:
select a.line_item_id,sum(a.is_imp) as imps,sum(c.grossImpressions) as cs_imps
from table1 a join table2 c
ON a.line_item_id=c.cs_line_item_id
where a.line_item_id=5993252
group by 1;
I'm using aggregates, group by, and filtering, so I'm not sure where I'm going wrong. (The table schemas referenced in the original post are not included here.)
One approach is to aggregate each table separately and then join the aggregated results:
select a.*, b.cs_imps as table2_imps
from (select line_item_id, sum(is_imp) as imps
      from table1
      group by 1
     ) a
join (select cs_line_item_id, sum(grossImpressions) as cs_imps
      from table2
      group by 1
     ) b
  on a.line_item_id = b.cs_line_item_id
You are generating a Cartesian product for each line_item_id. There are two relatively simple ways to solve this: one with a full join (sketched at the end of this answer), the other with union all:
select line_item_id, sum(imps) as imps, sum(grossImpressions) as cs_imps
from ((select a.line_item_id, sum(is_imp) as imps, 0 as grossImpressions
       from table1 a
       where a.line_item_id = 5993252
       group by a.line_item_id
      ) union all
      (select c.cs_line_item_id as line_item_id, 0 as imps, sum(grossImpressions) as grossImpressions
       from table2 c
       where c.cs_line_item_id = 5993252
       group by c.cs_line_item_id
      )
     ) ac
group by line_item_id;
You can remove the where clause from the subqueries to get the total for all line_item_ids. Note that this works even when one or the other table has no matching rows for a given line_item_id.
For performance, you really want to do the filtering before the group by.
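For completeness, here is a sketch of the full join alternative mentioned above, using the same tables and filter as the question:
select coalesce(a.line_item_id, c.cs_line_item_id) as line_item_id,
       coalesce(a.imps, 0) as imps,
       coalesce(c.cs_imps, 0) as cs_imps
from (select line_item_id, sum(is_imp) as imps
      from table1
      where line_item_id = 5993252
      group by line_item_id
     ) a
full outer join
     (select cs_line_item_id, sum(grossImpressions) as cs_imps
      from table2
      where cs_line_item_id = 5993252
      group by cs_line_item_id
     ) c
     on a.line_item_id = c.cs_line_item_id;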

Firebird group clause

I can't understand Firebird's GROUP BY logic.
Query:
SELECT t.id FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id
works perfectly
But when I try to get other fields:
SELECT * FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id
I get error: Invalid expression in the select list (not contained in either an aggregate function or the GROUP BY clause)
When you use GROUP BY in your query, the field or fields specified are used as 'keys', and data rows are grouped based on unique combinations of those fields. In the result set, every such unique combination has one and only one row.
In your case, the only identifier in the group is t.id. Now consider that you have 2 records in the table, both with t.id = 1, but having different values for another column, say, t.name. If you try to select both id and name columns, it directly contradicts the constraint that one group can have only one row. That is why you cannot select any field apart from the group key.
For aggregate functions it is different. That is because, when you sum or count values or get the maximum, you are basically performing that operation only based on the id field, effectively ignoring the data in the other columns. So, there is no issue because there can only be one answer to, say, count of all names with a particular id.
In conclusion, if you want to show a column in the results, you need to group by it. This will however, make the grouping more granular, which may not be desirable. In that case, you can do something like this:
select * from T1 t
where t.id in
(SELECT t.id FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id)
When you use a GROUP BY clause in a SELECT, the select list may contain only aggregate functions or columns that are listed in the GROUP BY clause. More about GROUP BY: http://www.firebirdsql.org/manual/nullguide-aggrfunc.html
As example:
SELECT Max(t.jid), t.id FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id
SELECT * FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id
This will not execute, because you have grouped by t.id only; every column in the select clause must either be wrapped in an aggregate function or be included in the GROUP BY clause.
SELECT * means you are selecting all columns, so all columns except t.id are neither in the GROUP BY nor inside an aggregate function (two ways to fix it are sketched below).
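Here are the two fixes, sketched with only the columns mentioned in the question (assuming one representative value per t.id is enough):
-- either add every selected column to the GROUP BY ...
SELECT t.id, t.jid, j.id
FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id, t.jid, j.id

-- ... or wrap the extra columns in aggregate functions
SELECT t.id, MAX(t.jid) AS jid, MAX(j.id) AS j_id
FROM T1 t
INNER JOIN T2 j ON j.id = t.jid
WHERE t.id = 1
GROUP BY t.id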
Try this link, How to use GROUP BY in firebird