SQL - SUM for the max ID - sql

I have a table like this,
| id | name | subtask | maintask |
|----|------|---------|----------|
| 1 | t1 | 11 | 20 |
| 1 | t1 | 12 | 20 |
| 1 | t1 | 1 | 30 |
| 2 | t1 | 2 | 20 |
| 2 | t1 | 2 | 20 |
I want to prepare a result like this
| id | name | sum_of_subtask | sum_of_maintask | diff |
|----|------|----------------|-----------------|------|
| 2 | t1 | 4 | 40 | 36 |
I need to pick the max ID, then sum subtask and maintask for that ID; the last column is the difference between sum(subtask) and sum(maintask).
I tried the query below, but it calculates the sums over all rows, not just the rows with the max ID.
select max(id), name, sum(subtask),sum(maintask),sum(subtask-maintask) from tbl
group by name

Do you just want one row? If so, use order by and limit:
select id, name, sum(subtask), sum(maintask), sum(subtask-maintask)
from tbl
group by id, name
order by id desc
limit 1;
If your data is large, it might be more efficient to filter before aggregating:
select id, name, sum(subtask), sum(maintask), sum(subtask-maintask)
from tbl
where id = (select max(id) from tbl)
group by id, name;
If you want the maximum id per name, then the filtering logic is:
select id, name, sum(subtask), sum(maintask), sum(subtask-maintask)
from tbl t
where t.id = (select max(t2.id) from tbl t2 where t2.name = t.name)
group by id, name;
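A minimal sketch for checking the filter-before-aggregate query against the sample data from the question. It runs the SQL through Python's built-in sqlite3 module on an in-memory database, which is an assumption for illustration only (the question does not name a database). The difference is written here as sum(maintask) - sum(subtask) so that it comes out as 36, as in the expected result; sum(subtask - maintask) would give -36.

```python
import sqlite3

# In-memory SQLite database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (id INTEGER, name TEXT, subtask INTEGER, maintask INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [(1, "t1", 11, 20), (1, "t1", 12, 20), (1, "t1", 1, 30),
     (2, "t1", 2, 20), (2, "t1", 2, 20)],
)

# Filter to the max id first, then aggregate only those rows.
rows = conn.execute("""
    SELECT id, name,
           SUM(subtask)                 AS sum_of_subtask,
           SUM(maintask)                AS sum_of_maintask,
           SUM(maintask) - SUM(subtask) AS diff
    FROM tbl
    WHERE id = (SELECT MAX(id) FROM tbl)
    GROUP BY id, name
""").fetchall()

print(rows)  # [(2, 't1', 4, 40, 36)]
```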

Please use the query below:
select id, name, sum(subtask), sum(maintask), sum(subtask) - sum(maintask)
from tbl
where id in
(select max(id) from tbl)
group by id, name;

select id, name, sum(subtask), sum(maintask), sum(subtask-maintask)
from tbl
where id = (select max(id) from tbl)
group by id, name

Related

Join number of pairs in a single table using SQL

I have two tables of events in BigQuery that look as follows. The main idea is to count the number of events in each table (rows are always pairs of event_id and user_id) and join the counts into a single table that, for each pair appearing in either table, gives the number of events in each.
table 1:
| event_id | user id |
| -------- | ------- |
| 1 | 1 |
| 2 | 1 |
| 2 | 3 |
| 2 | 5 |
| 1 | 1 |
| 4 | 7 |
table 2:
| event_id | user id |
| -------- | ------- |
| 1 | 1 |
| 3 | 1 |
| 2 | 3 |
I would like to get a table which has the number of events of each table:
| event_id | user id | num_events_table1 | num_events_table2 |
| -------- | ------- | ----------------- | ----------------- |
| 1 | 1 | 2 | 1 |
| 2 | 1 | 1 | 0 |
| 2 | 3 | 1 | 1 |
| 2 | 5 | 1 | 0 |
| 4 | 7 | 1 | 0 |
| 3 | 1 | 0 | 1 |
Any idea of how to do this with sql? I have tried this:
SELECT i1, e1, num_viewed, num_displayed FROM
(SELECT id as i1, event as e1, count(*) as num_viewed
FROM table_1
group by id, event) a
full outer JOIN (SELECT id as i2, event as e2, count(*) as num_displayed
FROM table_2
group by id, event) b
on a.i1 = b.i2 and a.e1 = b.e2
This does not return exactly what I want: I am getting rows where i1 and e1 are null.
Consider below
#standardSQL
with `project.dataset.table1` as (
select 1 event_id, 1 user_id union all
select 2, 1 union all
select 2, 3 union all
select 2, 5 union all
select 1, 1 union all
select 4, 7
), `project.dataset.table2` as (
select 1 event_id, 1 user_id union all
select 3, 1 union all
select 2, 3
)
select event_id, user_id,
countif(source = 1) as num_events_table1,
countif(source = 2) as num_events_table2
from (
select 1 source, * from `project.dataset.table1`
union all
select 2, * from `project.dataset.table2`
)
group by event_id, user_id
If applied to the sample data in your question, the output matches the expected table shown there (row order aside).
If I understand correctly, the simplest method is to modify your query via a USING clause along with COALESCE():
SELECT id, event, COALESCE(num_viewed, 0), COALESCE(num_displayed, 0)
FROM (SELECT id, event, count(*) as num_viewed
FROM table_1
GROUP BY id, event
) t1 FULL JOIN
(SELECT id , event, COUNT(*) as num_displayed
FROM table_2
GROUP BY id, event
) t2
USING (id, event);
Note: This requires that the two columns used for the JOIN have the same name. If this is not the case, then you might still need column aliases in the subqueries.
One way is to aggregate the union:
select event_id, user_id, sum(cnt1) cnt1, sum(cnt2) cnt2
from (
select event_id, user_id, 1 cnt1, 0 cnt2
from table_1
union all
select event_id, user_id, 0 cnt1, 1 cnt2
from table_2 ) t
group by event_id, user_id
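As a rough check of the aggregate-the-union idea, here is a sketch that loads the question's sample rows into an in-memory SQLite database via Python's sqlite3 module. SQLite stands in for BigQuery purely for illustration, and the column names event_id and user_id are assumed from the table headers in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_1 (event_id INTEGER, user_id INTEGER)")
conn.execute("CREATE TABLE table_2 (event_id INTEGER, user_id INTEGER)")
conn.executemany("INSERT INTO table_1 VALUES (?, ?)",
                 [(1, 1), (2, 1), (2, 3), (2, 5), (1, 1), (4, 7)])
conn.executemany("INSERT INTO table_2 VALUES (?, ?)",
                 [(1, 1), (3, 1), (2, 3)])

# Tag each row with a 1 in the counter for its source table, then sum per pair.
rows = conn.execute("""
    SELECT event_id, user_id,
           SUM(cnt1) AS num_events_table1,
           SUM(cnt2) AS num_events_table2
    FROM (
        SELECT event_id, user_id, 1 AS cnt1, 0 AS cnt2 FROM table_1
        UNION ALL
        SELECT event_id, user_id, 0 AS cnt1, 1 AS cnt2 FROM table_2
    ) t
    GROUP BY event_id, user_id
    ORDER BY event_id, user_id
""").fetchall()

for row in rows:
    print(row)
```

Because every pair comes out of the union, pairs that exist in only one table get a 0 for the other count without needing a full outer join or COALESCE.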

SQL: Select only one row of table with same value

I'm a bit new to SQL, and for my project I need to do some database sorting and filtering.
Let's assume my database looks like this:
==========================================
| id | email | name
==========================================
| 1 | 123#test.com | John
| 2 | 234#test.com | Peter
| 3 | 234#test.com | Steward
| 4 | 123#test.com | Ethan
| 5 | 542#test.com | Bob
| 6 | 123#test.com | Patrick
==========================================
What should I do so that only the last row for each email is returned?
==========================================
| id | email | name
==========================================
| 3 | 234#test.com | Steward
| 5 | 542#test.com | Bob
| 6 | 123#test.com | Patrick
==========================================
Thanks in advance!
SQL Query:
SELECT * FROM test.test1 WHERE id IN (
SELECT MAX(id) FROM test.test1 GROUP BY email
);
Hope this solves your problem. Thanks.
A generic way to do this in SQL is to use the ANSI standard row_number() function:
select t.*
from (select t.*, row_number() over (partition by email order by id desc) as seqnum
from t
) t
where seqnum = 1;
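Here is a shorter query, but note that it returns only one row in total (the one with the highest email value), not one row per email: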
SELECT *
FROM table
ORDER BY email DESC
LIMIT 1;
You can use following query to get the MAX id value per email:
SELECT email, MAX(id)
FROM mytable
GROUP BY email
Using the above query as a derived table you can obtain the whole record:
SELECT t1.*
FROM mytable AS t1
JOIN (
SELECT email, MAX(id) AS id
FROM mytable
GROUP BY email
) AS t2 ON t1.id = t2.id
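The derived-table join can be tried on the question's sample rows with a short sketch like the following. It uses Python's sqlite3 module and an in-memory database as an assumed stand-in for whatever database the asker is running; the table name mytable follows the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "123#test.com", "John"), (2, "234#test.com", "Peter"),
     (3, "234#test.com", "Steward"), (4, "123#test.com", "Ethan"),
     (5, "542#test.com", "Bob"), (6, "123#test.com", "Patrick")],
)

# Join each row to the per-email maximum id; only the latest row per email survives.
rows = conn.execute("""
    SELECT t1.*
    FROM mytable AS t1
    JOIN (
        SELECT email, MAX(id) AS id
        FROM mytable
        GROUP BY email
    ) AS t2 ON t1.id = t2.id
    ORDER BY t1.id
""").fetchall()

print(rows)
# [(3, '234#test.com', 'Steward'), (5, '542#test.com', 'Bob'), (6, '123#test.com', 'Patrick')]
```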

select top 1 with max 2 fields

I have this table :
+------+-------+------------------------------------+
| id | rev | class |
+------+-------+------------------------------------+
| 1 | 10 | 2 |
| 1 | 10 | 5 |
| 2 | 40 | 6 |
| 2 | 50 | 6 |
| 2 | 52 | 1 |
| 3 | 33 | 3 |
| 3 | 63 | 5 |
+------+-------+------------------------------------+
For each id, I only need the row with the max rev and, among those, the max class.
+------+-------+------------------------------------+
| id | rev | class |
+------+-------+------------------------------------+
| 1 | 10 | 5 |
| 2 | 52 | 1 |
| 3 | 63 | 5 |
+------+-------+------------------------------------+
Query cost is important for me.
Just the rows that satisfy the condition that it has both max values?
Here's an SQL Fiddle;
SELECT h.id, h.rev, h.class
FROM ( SELECT id,
MAX( rev ) rev,
MAX( class ) class
FROM Herp
GROUP BY id ) derp
INNER JOIN Herp h
ON h.id = derp.id
AND h.rev = derp.rev
AND h.class = derp.class;
The fastest way might be to have an index on t(id, rev) and t(id, class) and then do:
select t.*
from table t
where not exists (select 1
from table t2
where t2.id = t.id and t2.rev > t.rev
) and
not exists (select 1
from table t2
where t2.id = t.id and t2.class > t.class
);
SQL Server is pretty smart in terms of optimization, so the aggregation approach might be just as good. However, in terms of performance, this is just a bunch of index lookups.
Here is a SQL Server 2012 example. It is very straightforward, using a derived table and ROW_NUMBER() with PARTITION BY.
Basically, with each ID as a partition/group, sort the values of the other fields in a descending order assigning each one an incrementing RowId, then only take the first one.
select id, rev, [class]
from
(
SELECT id, rev, [class],
ROW_NUMBER() OVER(PARTITION BY id ORDER BY rev DESC, [class] desc) AS RowId
FROM sample
) t
where RowId = 1
Here is the SQL Fiddle
Keep in mind, this works with the criteria in the example dataset, and not the MAX of two fields as stated in the question's title.
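To see what the ROW_NUMBER() approach returns on the question's data, here is a small sketch using Python's sqlite3 module. It assumes an SQLite build with window-function support (3.25 or later) in place of SQL Server, and drops the [class] bracket quoting, which SQLite does not need here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (id INTEGER, rev INTEGER, class INTEGER)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?)",
    [(1, 10, 2), (1, 10, 5), (2, 40, 6), (2, 50, 6),
     (2, 52, 1), (3, 33, 3), (3, 63, 5)],
)

# Number rows within each id by rev, then class, both descending; keep the first.
rows = conn.execute("""
    SELECT id, rev, class
    FROM (
        SELECT id, rev, class,
               ROW_NUMBER() OVER (PARTITION BY id
                                  ORDER BY rev DESC, class DESC) AS RowId
        FROM sample
    ) t
    WHERE RowId = 1
    ORDER BY id
""").fetchall()

print(rows)  # [(1, 10, 5), (2, 52, 1), (3, 63, 5)]
```

For id 2 this picks (52, 1): rev is compared first and class only breaks ties, so the row with class 6 is not chosen.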
I guess you mean: the max of rev and the max of class. If not, please clarify what to do when there is no row where both fields have the highest value.
select id
, max(rev)
, max(class)
from table
group
by id
If you mean the highest total of rev and class per id, use this:
select t.id
, t.rev
, t.class
from table t
where t.rev + t.class =
( select max(t2.rev + t2.class)
from table t2
where t2.id = t.id
)

SQL remove duplicates from GROUP BY results

I have a table with the following structure
sys_id(identity) | id | group_id | fld_id | val
-----------------------------------------------
I have a query
SELECT id,group_id,fld_id,val,COUNT(*)
FROM [DB_ALERT].[dbo].[DATATABLE]
GROUP BY id,group_id,fld_id,val
HAVING COUNT(*)>1
The result set is like this
ID | group_id | fld_id | val| count(*)
__________________________________________
1000001| 1 | 1 | 23 | 2
1000003| 1 | 1 | 24 | 5
1000008| 1 | 1 | 14 | 4
Now, for each record in the result set, I want to keep only the top 1 sys_id and delete the other rows with the same id, group_id, fld_id and val (that is, remove the duplicates). I know how to do this with cursors, but is there any way to do this in a single query?
Please try:
;with c as
(
select *, row_number() over(partition by id, group_id, fld_id, val order by sys_id) as n
from [DB_ALERT].[dbo].[DATATABLE]
)
delete from c
where n > 1
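Deleting through a CTE as above is SQL Server syntax. As a rough, portable illustration of the same keep-one-row-per-group idea, the sketch below swaps in a different technique: it deletes every row whose sys_id is not the minimum of its group. It runs on an in-memory SQLite database via Python's sqlite3 module; the table name datatable and the inserted rows are hypothetical, chosen to reproduce the duplicate counts (2, 5 and 4) from the question's result set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE datatable (
        sys_id   INTEGER PRIMARY KEY AUTOINCREMENT,
        id       INTEGER,
        group_id INTEGER,
        fld_id   INTEGER,
        val      INTEGER
    )
""")

# Hypothetical rows matching the duplicate counts shown in the question.
duplicates = ([(1000001, 1, 1, 23)] * 2
              + [(1000003, 1, 1, 24)] * 5
              + [(1000008, 1, 1, 14)] * 4)
conn.executemany(
    "INSERT INTO datatable (id, group_id, fld_id, val) VALUES (?, ?, ?, ?)",
    duplicates,
)

# Keep the lowest sys_id in each (id, group_id, fld_id, val) group; delete the rest.
conn.execute("""
    DELETE FROM datatable
    WHERE sys_id NOT IN (
        SELECT MIN(sys_id)
        FROM datatable
        GROUP BY id, group_id, fld_id, val
    )
""")

remaining = conn.execute(
    "SELECT sys_id, id, group_id, fld_id, val FROM datatable ORDER BY sys_id"
).fetchall()
print(remaining)
```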

In MySQL: fetching rows distinct by year

I have a MySQL table similar to this:
| id | name | create_date |
---------------------------
| 1 | foo | 2003-03-11 |
| 2 | goo | 2003-04-27 |
| 3 | woo | 2004-10-07 |
| 4 | too | 2004-12-01 |
| 5 | hoo | 2005-04-20 |
| 6 | koo | 2006-01-12 |
| 7 | boo | 2006-04-17 |
| 8 | moo | 2006-08-19 |
I want to fetch all the latest yearly rows - one per year. So in the example above I'll get 2, 4, 5 and 8.
What's the right syntax?
Some of the other answers may work for you but this simple query does not require any joins
SELECT YEAR(create_date),
(SELECT id ORDER BY create_date DESC LIMIT 1)
FROM mytable
group by YEAR(create_date)
you can do something like
select * from table_name
where create_date in (
select max(create_date)
from table_name
group by year(create_date))
SELECT id FROM foo JOIN
(SELECT YEAR(create_date),MAX(create_date) AS md
FROM foo
GROUP BY YEAR(create_date)) as maxes
ON (create_date=md);
If you put an index on create_date, this will be fairly fast.
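A quick way to sanity-check the join against per-year maxima is the sketch below, which loads the question's rows into an in-memory SQLite database through Python's sqlite3 module. SQLite has no YEAR() function, so strftime('%Y', ...) is used in its place; that substitution, and storing the dates as ISO text, are assumptions made only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, name TEXT, create_date TEXT)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?, ?)",
    [(1, "foo", "2003-03-11"), (2, "goo", "2003-04-27"),
     (3, "woo", "2004-10-07"), (4, "too", "2004-12-01"),
     (5, "hoo", "2005-04-20"), (6, "koo", "2006-01-12"),
     (7, "boo", "2006-04-17"), (8, "moo", "2006-08-19")],
)

# Find the latest create_date in each year, then join back to get the full row's id.
rows = conn.execute("""
    SELECT foo.id
    FROM foo
    JOIN (
        SELECT strftime('%Y', create_date) AS yr, MAX(create_date) AS md
        FROM foo
        GROUP BY strftime('%Y', create_date)
    ) AS maxes ON foo.create_date = maxes.md
    ORDER BY foo.id
""").fetchall()

ids = [r[0] for r in rows]
print(ids)  # [2, 4, 5, 8]
```

If two rows share the latest date within a year, this join returns both of them.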
SELECT mi.*
FROM (
SELECT DISTINCT YEAR(create_date) AS dyear
FROM mytable
) md
JOIN mytable mi
ON mi.id =
(
SELECT id
FROM mytable ml
WHERE ml.create_date < CAST(CONCAT_WS('.', dyear + 1, 1, 1) AS DATETIME)
ORDER BY
ml.create_date DESC
LIMIT 1
)
select id
from mytable
where not exists (
select * from mytable as T2
where year(T2.create_date) = year(mytable.create_date)
and T2.create_date > mytable.create_date
)