I want to group rows without any specific criterion other than the number of rows in each resulting group. I have a table like this:
DATE VAL1 VAL2
------------ ------ ------
01-01-2013 5 8
01-02-2013 14 23
01-03-2013 10 6
01-04-2013 21 88
01-05-2013 9 11
01-06-2013 4 9
01-07-2013 19 42
01-08-2013 8 4
01-09-2013 12 1
01-10-2013 2 8
01-11-2013 31 65
01-12-2013 3 6
...
Assume the date field could just as well be a number rather than a date.
What I want is, for example, the total sum or average over groups of rows, where every group has the same fixed number of rows.
For example, with three rows per group, where I want the total sum of VAL1 and the average of VAL2:
INTERVAL SUM VAL 1 AVG VAL 2
----------------------- --------- ---------
01-01-2013 - 01-03-2013 29 12.3
01-04-2013 - 01-06-2013 34 36.0
01-07-2013 - 01-09-2013 39 15.6
01-10-2013 - 01-12-2013 36 26.3
...
I really think it's possible to do this with a query, but I can't find the right "group by" clause. Can somebody help me?
Thanks a lot in advance!
You can divide the row_number() value by 3 and round up, which assigns the same number to each group of 3 consecutive rows. Then you can aggregate on this group number.
select min("DATE") ||'-'||max("DATE"),
sum(val1),
avg(val2)
from (
select "DATE", val1, val2,
ceil(row_number() over (order by "DATE") / 3.0) as grp
from mytab
) as x
group by grp
order by grp;
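To make the grouping rule concrete, here is a minimal Python sketch of the same idea: row number r (1-based) lands in group ceil(r / 3), which is the same as slicing the ordered rows into chunks of three. The function name `summarize_chunks` is hypothetical, and the rows are assumed to be already sorted by date.

```python
# Sample rows from the question: (date, val1, val2), assumed already sorted.
rows = [
    ("01-01-2013", 5, 8), ("01-02-2013", 14, 23), ("01-03-2013", 10, 6),
    ("01-04-2013", 21, 88), ("01-05-2013", 9, 11), ("01-06-2013", 4, 9),
    ("01-07-2013", 19, 42), ("01-08-2013", 8, 4), ("01-09-2013", 12, 1),
    ("01-10-2013", 2, 8), ("01-11-2013", 31, 65), ("01-12-2013", 3, 6),
]

def summarize_chunks(rows, size):
    """Slice rows into consecutive chunks of `size`; sum val1, average val2."""
    result = []
    for start in range(0, len(rows), size):
        chunk = rows[start:start + size]  # the last chunk may be shorter
        interval = f"{chunk[0][0]} - {chunk[-1][0]}"
        total_val1 = sum(r[1] for r in chunk)
        avg_val2 = sum(r[2] for r in chunk) / len(chunk)
        result.append((interval, total_val1, round(avg_val2, 1)))
    return result

summary = summarize_chunks(rows, 3)
for line in summary:
    print(line)
```

If the row count is not a multiple of the group size, the last group is simply smaller, which is also what the SQL version does.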
Please try:
select
min("DATE")||' - '||max("DATE") "Interval",
sum(Val1) "SUM VAL 1",
cast(avg(Val2) as numeric(18,1)) "AVG VAL 2"
from(
select
"DATE",
ceil(extract(month from "DATE")/3) dt,
Val1,
Val2
from
YourTable
)x
group by dt
order by dt
The ROWNUM pseudocolumn in Oracle or LIMIT in MySQL could help you achieve it.
I think what you mean is pagination, as described at this link:
http://www.oracle.com/technetwork/issue-archive/2006/06-sep/o56asktom-086197.html
I have a table, consisting of 3 columns (Person, Year and Count), so for each person, there are several rows with different years and counts and the final row with total count. I want to keep the table ordered by Name, but also order it by the total count.
So the rows should be ordered by sum, but also grouped by the Person and ordered by year. When I am trying to order by sum, of course, both person and years are messed up. Is there a way to sort like this?
You've stored those "total" rows as well? Gosh! Why did you do that?
Anyway: if you
compute rank for rows whose year column is equal to 'total' and
add case expression into the order by clause,
you might get what you want:
SQL> with sorter as
2 (select name, cnt,
3 rank() over (order by cnt) rnk
4 from test
5 where year = 'total'
6 )
7 select t.*
8 from test t join sorter s on s.name = t.name
9 order by s.rnk, case when year = 'total' then '9'
10 else year
11 end;
NAME YEAR CNT
---- ----- ----------
John 2018 3
John 2019 2
John total 5
Bob 2017 2
Bob 2019 4
Bob total 6
6 rows selected.
SQL>
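The same ordering can be sketched in Python, assuming the rows shown in the output above: look up each person's stored total, sort persons by that total, and within a person put the years in ascending order with the "total" row last. The name tie-breaker is my own addition (the SQL above does not have one) so that two persons with equal totals do not interleave.

```python
# Rows as (name, year, cnt); the "total" rows are stored alongside the yearly rows.
rows = [
    ("Bob", "2017", 2), ("Bob", "2019", 4), ("Bob", "total", 6),
    ("John", "2018", 3), ("John", "2019", 2), ("John", "total", 5),
]

# Each person's stored total; this plays the role of the rank() subquery.
totals = {name: cnt for name, year, cnt in rows if year == "total"}

def sort_key(row):
    name, year, _ = row
    # Person total first, then name (assumed tie-breaker),
    # then the "total" row after all year rows, then year ascending.
    return (totals[name], name, year == "total", year)

ordered = sorted(rows, key=sort_key)
for row in ordered:
    print(row)
```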
I have data in a table in the format below:
Id EmployeeCode JobNumber TransferNo FromDate Todate
--------------------------------------------------------------------------
1 127 1.0 0 01-Mar-19 10-Mar-19
2 127 1.0 NULL 11-Mar-19 15-Mar-19
3 127 J-1 1 16-Mar-19 NULL
4 136 1.0 0 01-Mar-19 15-Mar-19
5 136 J-1 1 16-Mar-19 20-Mar-19
6 136 1.0 2 21-Mar-19 NULL
And I want result like this:
Id EmployeeCode JobNumber TransferNo FromDate Todate
--------------------------------------------------------------------------
2 127 1.0 NULL 01-Mar-19 15-Mar-19
3 127 J-1 1 16-Mar-19 NULL
4 136 1.0 0 01-Mar-19 15-Mar-19
5 136 J-1 1 16-Mar-19 20-Mar-19
6 136 1.0 2 21-Mar-19 NULL
The idea is:
If the job number is the same in consecutive rows, return a single row with the max id, the min from-date and the max to-date. For example, for employee 127 the first and second rows have the same job number and the third row differs, so the first and second rows are merged into one row with the minimum FromDate and the maximum Todate, and the third row is returned as is.
If a job number differs from the next row's job number, all rows are returned.
For example, for employee 136 the first job number differs from the second and the second differs from the third, so all rows are returned.
You can group by jobNumber and EmployeeCode and use the Max/Min-Aggregate-Functions to get the dates you want
I doubt you will get a result from simple set-based queries.
So my advice: Declare a cursor on SELECT DISTINCT EmployeeCode .... Within that cursor select all rows with that EmployeeCode. Work in this set to figure out your values and construct a resultset from that.
This is an example of a gaps and islands problem. The solution here is to define the "islands" by their starts, so the process is:
determine when a new grouping begins (i.e. no overlap with previous row)
do a cumulative sum of the starts to get the grouping value
aggregate
This looks like
select max(id), EmployeeCode, JobNumber,
min(fromdate), max(todate)
from (select t.*,
sum(case when fromdate = dateadd(day, 1, prev_todate) then 0 else 1 end) over
(partition by EmployeeCode, JobNumber order by id
) as grouping
from (select t.*,
lag(todate) over (partition by EmployeeCode, JobNumber order by id) as prev_todate
from t
) t
) t
group by grouping, EmployeeCode, JobNumber;
It is unclear what the logic is for TransferNo. The simplest solution is just min() or max(), but that will not return NULL.
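As a rough illustration of the three steps (detect island starts, number the islands, aggregate), here is a Python sketch over the sample rows. The helper name `merge_islands` is hypothetical. One assumption differs from the SQL: the merged row keeps the last row's to-date, so an open-ended (NULL) end stays open, whereas max(todate) would ignore the NULL.

```python
from datetime import date, timedelta

# (id, employee_code, job_number, from_date, to_date); None means open-ended.
rows = [
    (1, 127, "1.0", date(2019, 3, 1), date(2019, 3, 10)),
    (2, 127, "1.0", date(2019, 3, 11), date(2019, 3, 15)),
    (3, 127, "J-1", date(2019, 3, 16), None),
    (4, 136, "1.0", date(2019, 3, 1), date(2019, 3, 15)),
    (5, 136, "J-1", date(2019, 3, 16), date(2019, 3, 20)),
    (6, 136, "1.0", date(2019, 3, 21), None),
]

def merge_islands(rows):
    islands = []
    last = {}  # (employee, job) -> index of that pair's most recent island
    for rid, emp, job, start, end in sorted(rows, key=lambda r: r[0]):
        idx = last.get((emp, job))
        if idx is not None:
            prev = islands[idx]
            prev_end = prev[4]
            # Same island only if this row starts the day after the previous end.
            if prev_end is not None and start == prev_end + timedelta(days=1):
                islands[idx] = (rid, emp, job, prev[3], end)
                continue
        # Otherwise this row starts a new island.
        islands.append((rid, emp, job, start, end))
        last[(emp, job)] = len(islands) - 1
    return islands

merged = merge_islands(rows)
for row in merged:
    print(row)
```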
This is my table:
id user_id date balance
1 1 2016-05-10 10
2 1 2016-05-10 30
3 2 2017-04-24 5
4 2 2017-04-27 10
5 3 2017-11-10 40
I want to group the rows by user_id and sum the balance, but so that the sum is equal or less than 30. Moreover, I need to display the minimum date in the group. It should look like this:
id balance date_start
1-1 10 2016-05-10
1-2 30 2016-05-10
2-1 15 2017-04-24
Excuse for my language. Thanks.
You should be able to do this with GROUP BY and HAVING. Here is an example that may solve your case:
SELECT user_id, SUM(balance) AS balance, MIN(date) AS date_start
FROM your_table
GROUP BY user_id
HAVING SUM(balance) <= 30
This does it in one query, but be careful when running it on very large tables.
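Note that a plain GROUP BY user_id with HAVING keeps or drops whole users; it does not split one user into several sub-groups such as 1-1 and 1-2. The expected output in the question looks more like a greedy fill, which is sketched below in Python. This is only one reading of the question: the function name `capped_groups` is hypothetical, and dropping a single row whose balance already exceeds the cap (user 3) is an assumption inferred from the expected output.

```python
from itertools import groupby

# (id, user_id, date, balance) rows from the question; ISO dates sort as strings.
rows = [
    (1, 1, "2016-05-10", 10),
    (2, 1, "2016-05-10", 30),
    (3, 2, "2017-04-24", 5),
    (4, 2, "2017-04-27", 10),
    (5, 3, "2017-11-10", 40),
]

def capped_groups(rows, cap=30):
    out = []
    ordered = sorted(rows, key=lambda r: (r[1], r[0]))
    for user_id, user_rows in groupby(ordered, key=lambda r: r[1]):
        bins = []  # each bin is [running_sum, earliest_date]
        for _, _, day, balance in user_rows:
            if balance > cap:
                continue  # assumption: a row over the cap is dropped
            if bins and bins[-1][0] + balance <= cap:
                # Row still fits in the current sub-group.
                bins[-1][0] += balance
                bins[-1][1] = min(bins[-1][1], day)
            else:
                # Start a new sub-group for this user.
                bins.append([balance, day])
        for n, (total, start) in enumerate(bins, start=1):
            out.append((f"{user_id}-{n}", total, start))
    return out

result = capped_groups(rows)
for row in result:
    print(row)
```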
I need to count a value (M_Id) at each change of a date (RS_Date) and create a column grouped by the RS_Date that has an active total from that date.
So the table is:
Ep_Id Oa_Id M_Id M_StartDate RS_Date
--------------------------------------------
1 2001 5 1/1/2014 1/1/2014
1 2001 9 1/1/2014 1/1/2014
1 2001 3 1/1/2014 1/1/2014
1 2001 11 1/1/2014 1/1/2014
1 2001 2 1/1/2014 1/1/2014
1 2067 7 1/1/2014 1/5/2014
1 2067 1 1/1/2014 1/5/2014
1 3099 12 1/1/2014 3/2/2014
1 3099 14 2/14/2014 3/2/2014
1 3099 4 2/14/2014 3/2/2014
So my goal is like
RS_Date Active
-----------------
1/1/2014 5
1/5/2014 7
3/2/2014 10
If M_StartDate = RS_Date I need to count the M_Id values; then, for each RS_Date that is not equal to the start date, I need to count the M_Id values and add that to the M_StartDate count, then count the next RS_Date and add that to the last active count.
I can get the basic counts with something like
(Case when M_StartDate <= RS_Date
then [m_Id] end) as Test.
But I am stuck as how to get to the result I want.
Any help would be greatly appreciated.
Brian
-added in response to comments
I am using Server Ver 10
If using SQL Server 2012+, you can use ROWS with the analytic/window functions:
;with cte AS (SELECT RS_Date
,COUNT(DISTINCT M_ID) AS CT
FROM Table1
GROUP BY RS_Date
)
SELECT *,SUM(CT) OVER(ORDER BY RS_Date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Run_CT
FROM cte
If stuck using something prior to 2012 you can use:
;with cte AS (SELECT RS_Date
,COUNT(DISTINCT M_ID) AS CT
FROM Table1
GROUP BY RS_Date
)
SELECT a.RS_Date
,SUM(b.CT)
FROM cte a
LEFT JOIN cte b
ON a.RS_DAte >= b.RS_Date
GROUP BY a.RS_Date
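Both queries compute the same thing: a count of distinct M_Id per RS_Date, followed by a running total in date order. A minimal Python sketch of that logic on the sample data (dates rewritten as ISO strings so they sort correctly, which is my own convention here):

```python
from itertools import accumulate

# (M_Id, RS_Date) pairs from the question's table.
rows = [
    (5, "2014-01-01"), (9, "2014-01-01"), (3, "2014-01-01"),
    (11, "2014-01-01"), (2, "2014-01-01"),
    (7, "2014-01-05"), (1, "2014-01-05"),
    (12, "2014-03-02"), (14, "2014-03-02"), (4, "2014-03-02"),
]

# Step 1: distinct M_Id per RS_Date (the CTE in the SQL above).
ids_by_date = {}
for m_id, rs_date in rows:
    ids_by_date.setdefault(rs_date, set()).add(m_id)

# Step 2: running total of those counts in date order.
dates = sorted(ids_by_date)
counts = [len(ids_by_date[d]) for d in dates]
running = list(zip(dates, accumulate(counts)))

for rs_date, active in running:
    print(rs_date, active)
```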
You need a cumulative sum, easy in SQL Server 2012 using Windowed Aggregate Functions. Based on your description this will return the expected result
SELECT Ep_Id, RS_Date,
SUM(COUNT(*))
OVER (PARTITION BY Ep_Id
ORDER BY RS_Date
ROWS UNBOUNDED PRECEDING)
FROM tab
GROUP BY Ep_Id, RS_Date
It looks like you want something like this:
SELECT
RS_Date,
SUM(c) OVER (PARTITION BY M_StartDate ORDER BY RS_Date ROWS UNBOUNDED PRECEDING)
FROM
(
SELECT M_StartDate, RS_Date, COUNT(DISTINCT M_Id) AS c
FROM my_table
GROUP BY M_StartDate, RS_Date
) counts
The inline view computes the counts of distinct M_Id values within each (M_StartDate, RS_Date) group (distinctness enforced only within the group), and the outer query uses the analytic version of SUM() to add up the counts within each M_StartDate.
Note that this particular query will not exactly reproduce your example results. It will instead produce:
RS_Date Active
-----------------
1/1/2014 5
1/5/2014 7
3/2/2014 8
3/2/2014 2
This is on account of some rows in your example data with RS_Date 3/2/2014 having a later M_StartDate than others. If this is not what you want then you need to clarify the question, which currently seems a bit inconsistent.
Unfortunately, windowed aggregates with ORDER BY are not available until SQL Server 2012. In SQL Server 2008 (version 10), the job is messier. It could be done like this:
WITH gc AS (
SELECT M_StartDate, RS_Date, COUNT(DISTINCT M_Id) AS c
FROM my_table
GROUP BY M_StartDate, RS_Date
)
SELECT
RS_Date,
(
SELECT SUM(c)
FROM gc gc2
WHERE gc2.M_StartDate = gc.M_StartDate AND gc2.RS_Date <= gc.RS_Date
) AS Active
FROM gc
If you are using SQL Server 2012 or newer, you can use LAG to read the previous row's value; note that the running total itself still needs a windowed SUM() OVER (ORDER BY ...).
https://msdn.microsoft.com/en-us/library/hh231256(v=sql.110).aspx
If data is in the following format:
SID TID Tdatetime QID QTotal
----------------------------------------
100 1 01/12/97 9:00AM 66 110
100 1 01/12/97 9:00AM 66 110
100 1 01/12/97 10:00AM 67 110
100 2 01/19/97 9:00AM 66 .
100 2 01/19/97 9:00AM 66 110
100 2 01/19/97 10:00AM 66 110
100 3 01/26/97 9:00AM 68 120
100 3 01/26/97 9:00AM 68 120
110 1 02/03/97 10:00AM 68 110
110 3 02/12/97 9:00AM 64 115
110 3 02/12/97 9:00AM 64 115
120 1 04/05/97 9:00AM 66 105
120 1 04/05/97 10:00AM 66 105
I would like to be able to write a query to sum the QTotal column for all rows and find the count of duplicate rows for the Tdatetime column.
The output would look like:
Year Total Count
97 | 1340 | 4
The third column in the result does not include the count of distinct rows in the table. And the output is grouped by the year in the TDateTime column.
The following query may help:
SELECT
'YEAR ' + CAST(sub.theYear AS VARCHAR(4)),
COUNT(sub.C),
(SELECT SUM(QTotal) FROM MyTable WHERE YEAR(Tdatetime) = sub.theYear) AS total
FROM
(SELECT
YEAR(Tdatetime) AS theYear,
COUNT(Tdatetime) AS C
FROM MyTable
GROUP BY Tdatetime, YEAR(Tdatetime)
HAVING COUNT(Tdatetime) >= 2) AS sub
GROUP BY sub.theYear
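To check the expected numbers, here is a small Python sketch of the same computation on the sample data: total QTotal per year, plus the number of distinct Tdatetime values that occur at least twice. The missing QTotal (shown as "." in the table) is treated as 0, which is an assumption; the function name `summarize` is hypothetical.

```python
from collections import Counter, defaultdict

# (Tdatetime, QTotal) from the question; None stands for the missing value ".".
rows = [
    ("01/12/97 9:00AM", 110), ("01/12/97 9:00AM", 110), ("01/12/97 10:00AM", 110),
    ("01/19/97 9:00AM", None), ("01/19/97 9:00AM", 110), ("01/19/97 10:00AM", 110),
    ("01/26/97 9:00AM", 120), ("01/26/97 9:00AM", 120),
    ("02/03/97 10:00AM", 110),
    ("02/12/97 9:00AM", 115), ("02/12/97 9:00AM", 115),
    ("04/05/97 9:00AM", 105), ("04/05/97 10:00AM", 105),
]

def summarize(rows):
    totals = defaultdict(int)
    occurrences = Counter()
    for stamp, qtotal in rows:
        year = stamp.split()[0].split("/")[2]  # "01/12/97 9:00AM" -> "97"
        totals[year] += qtotal or 0            # assumption: missing counts as 0
        occurrences[(year, stamp)] += 1
    # A timestamp counts as a duplicate when it appears two or more times.
    dupes = Counter(year for (year, _), n in occurrences.items() if n >= 2)
    return {year: (totals[year], dupes[year]) for year in totals}

summary = summarize(rows)
print(summary)
```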
This will work if you really want to group by the tDateTime column:
SELECT tDateTime, SUM(QTotal), COUNT(*)
FROM Table
GROUP BY tDateTime
HAVING COUNT(*) > 1
But your results look like you want to group by the Year in the tDateTime column. Is this correct?
If so try this:
SELECT DISTINCT YEAR (tDateTime), SUM(QTotal), Count(distinct tDateTime)
FROM Table
GROUP BY YEAR (tDateTime)
HAVING Count(distinct tDateTime) > 1
You must SELECT from this table, GROUPing by QTotal, using COUNT(sub-SELECT from this table WHERE QTotal is the same). If only I had time I would write you the SQL statement, but it would take some minutes.
Something like:
select Year(Tdatetime) ,sum(QTotal), count(1) from table group by year(Tdatetime )
or full date
select Tdatetime ,sum(QTotal), count(1) from table group by Tdatetime
Or your ugly syntax ( : ) )
select 'Year ' + cast(Year(tdatetime) as varchar(4))
+ '|' + cast(sum(QTotal) as varchar(31))
+ '|' + cast(count(1) as varchar(31))
from table group by year(Tdatetime )
Or do you want just the year? Sum all columns? Or just by year?
SELECT
'Year ' + CAST(YEAR(Tdatetime) AS VARCHAR(4)),
SUM(QTotal),
(SELECT COUNT(*) FROM (
SELECT Tdatetime FROM MyTable GROUP BY Tdatetime
HAVING COUNT(QID) > 1) C)
FROM
MyTable t
GROUP BY
'Year ' + CAST(YEAR(Tdatetime) AS VARCHAR(4))
This is the first time I have asked a question on stackoverflow. It looks like I have lost my original ID info. I had to register to login and add comments to the question I posted.
To answer OMG Ponies question, this is a SQL Server 2008 database.
@Abe Miessler, the row with SID 120 does not contain duplicates. The first row for SID 120 shows 9:00AM in the datetime column, and the second row shows 10:00AM.
@Zafer, your query is the accepted answer. I made a few minor tweaks to get it to work. Thanks.
Thanks due to Abe Miessler and the others for your help.