Efficient query for average count in SQL

Efficient query for average count in SQL - sql

I have a table:
id age tag
1 26 green
1 26 blue
2 28 yellow
3 45 red
4 40 blue
4 40 green
5 50 red
I need to find both the average and standard deviation of the number of distinct tags per age group (I have coded this with CASE WHEN).
So to get for example:
average_age average_tags sd_tags age_group
26.7 1.5 0.1 25-34
40 2 0 35-44
45 1 0.01 44+
Is there a way to more efficiently calculate the average count of tags and also to avoid repeating the CASE code in the sub-query for efficiency / readability?
SELECT
AVG(age) as average_age,
AVG(countField) as average_spread,
STDDEV(countField) as sd_spread,
CASE
WHEN v.age BETWEEN 25 AND 34 THEN '25-34'
WHEN v.age BETWEEN 35 AND 44 THEN '35-44'
ELSE '44+'
END AS age_group,
FROM vid v
(
SELECT AVG(v.age),
COUNT(DISTINCT(p.tag)) AS countField,
CASE
WHEN v.age BETWEEN 25 AND 34 THEN '25-34'
WHEN v.age BETWEEN 35 AND 44 THEN '35-44'
ELSE '44+'
END AS age_group,
FROM vid v
GROUP BY age_group
) as q WHERE age IS NOT NULL
GROUP BY age_group ORDER BY age_group;

you can use this query
select AVG(T.Age) as average_age,AVG(countField) as average_spread,age_group
from (select v.Age,COUNT(DISTINCT(v.Tag)) as countField,CASE
WHEN v.age BETWEEN 25 AND 34 THEN '25-34'
WHEN v.age BETWEEN 35 AND 44 THEN '35-44'
ELSE '44+' END AS age_group
from VID v
group by v.Age) T
WHERE age IS NOT NULL
GROUP BY age_group ORDER BY age_group;

Related

Hive Summing up data in the table based on the date range

Have a table with the following schema design and the data residing inside it is like:
ID HITS MISS DDATE
1 10 3 20180101
1 33 21 20180122
1 84 11 20180901
1 11 2 20180405
1 54 23 20190203
1 33 43 20190102
4 54 22 20170305
4 56 88 20180115
5 87 22 20180809
5 66 48 20180617
5 91 53 20170606
DataTypes:
ID INT
HITS INT
MISS INT
DDATE STRING
The requirement is to calculate the total of the given (HITS and MISS) on yearly basis i.e 2017,2018,2019...
Written the following query:
SELECT ID,
SUM(HITS) AS HITS,SUM(MISS) AS MISS,
CASE
WHEN DDATE BETWEEN '201701' AND '201712' THEN '2017' ELSE
'NOTHING' END AS TTL_YR17_DATA
CASE
WHEN DDATE BETWEEN '201801' AND '201812' THEN '2018' ELSE
'NOTHING' END AS TTL_YR18_DATA
CASE
WHEN DDATE BETWEEN '201901' AND '201912' THEN '2019' ELSE
'NOTHING' END AS TTL_YR19_DATA
FROM
HST_TABLE
WHERE
DDATE BETWEEN '201801' AND '201812'
GROUP BY
ID,DDATE;
But, the query is not fetching the expected result.
Actual O/P:
1 10 3 2018
1 33 21 2018
1 84 11 2018
1 11 2 2018
1 54 23 2019
1 33 43 2019
4 54 22 2017
4 56 88 2018
5 87 22 2018
5 66 48 2018
5 91 53 2017
Expected O/P:
1 138 37 2018
4 56 88 2018
5 153 70 2018
1 87 66 2019
5 91 53 2017
Another related question:
Is there a way that I can avoid passing the DDATE range in the query? As this should be given by the user and shouldn't be hardcoded.
Any help/advice to achieve the above two requirements will be really helpful.

OK,it's easy to implement this with the substring function in HIVE, as below:
select
substring(dddate,0,4) as the_year,
id,
sum(hits) as hits_num,
sum(miss) as miss_num
from
hst_table
group by
substring(dddate,0,4),
id
order by
the_year,
id

The answer above by #Shawn.X is correct but has a logical flaw. Below is the corrected one:
select
substring(ddate,0,4) as the_year,
id,
sum(hits) as hits_num,
sum(miss) as miss_num
from
hst_table
group by
substring(ddate,0,4),
id
order by
the_year,
id;

Aggregate result from query by quarter SQL

Lets say I have a table which holds all exports for some time back in Microsoft SQL database:
Name:
ExportTable
Columns:
id - numeric(18)
exportdate - datetime
In order to get the number of exports per week I can run the following query:
SELECT DATEPART(ISO_WEEK,[exportdate]) as 'exportdate', count(exportdate) as 'totalExports'
FROM [ExportTable]
Group By DATEPART(ISO_WEEK,[exportdate])
order by exportdate;
Returns:
exportdate totalExports
---------- ------------
27 13
28 12
29 15
30 8
31 17
32 10
33 7
34 15
35 4
36 18
37 10
38 14
39 14
40 21
41 19
Would it be possible to aggregate the week results by quarter so the output becomes something like the bellow?
UPDATE
Sorry for not being crystal clear, I would like the current result to add upp with previous result up to a new quarter.
Note week 41 contains 21+19 = 40
Week 39 contains 157 (13+12+15+8+17+10+7+15+4+18+10+14+14)
exportdate totalExports Quarter
---------- ------------ -------
27 13 3
28 25 3
29 40 3
30 48 3
31 65 3
32 75 3
33 82 3
34 97 3
35 101 3
36 119 3
37 129 3
38 143 3
39 157 3 -- Sum of 3 Quarter values.
40 21 4 -- New Quarter show current week value
41 40 4 -- (21+19)

You can use this.
SELECT
DATEPART(ISO_WEEK,[exportdate]) as 'exportdate'
, SUM( count(exportdate) ) OVER ( PARTITION BY DATEPART(QUARTER,MIN([exportdate])) ORDER BY DATEPART(ISO_WEEK,[exportdate]) ROWS UNBOUNDED PRECEDING ) as 'totalExports'
, DATEPART(QUARTER,MIN([exportdate])) [Quarter]
FROM [ExportTable]
Group By DATEPART(ISO_WEEK,[exportdate])
order by exportdate;

You could use a case statement to separate the dates into quarters.
e.g.
CASE
WHEN EXPORT_DATE BETWEEN '1' AND '4' THEN 1
WHEN Export_Date BETWEEN '5' and '9' THEN 2
ELSE 0 AS [Quarter]
END
Its just an example but you get the idea.
You could then use the alias from the case

SELECT DATEPART(ISO_WEEK,[exportdate]) as 'exportdate', count(exportdate) as 'totalExports', DATEPART(quarter,[exportdate]) as quarter FROM [ExportTable] Group By DATEPART(ISO_WEEK,[exportdate]), DATEPART(quarter,[exportdate]) order by exportdate;

T-SQL Group by day date but i want show query full date

I want to show the date field can not group.
My Query:
SELECT DAY(T1.UI_CreateDate) AS DATEDAY, SUM(1) AS TOTALCOUNT
FROM mydb.dbo.LP_UseImpression T1 WHERE T1.UI_BR_BO_ID = 45
GROUP BY DAY(T1.UI_CreateDate)
Result:
DATEDAY TOTALCOUNT
----------- -----------
15 186
9 1
3 2
26 481
21 297
27 342
18 18
30 14
4 183
25 553
13 8
22 469
16 1
17 28
20 331
28 90
14 33
8 1
But i want to show the full date...
Example result:
DATEDAY TOTALCOUNT
----------- -----------
15/06/2015 186
9/06/2015 1
3/06/2015 2
26/06/2015 481
21/06/2015 297
27/06/2015 342
18/06/2015 18
30/06/2015 14
4/06/2015 183
25/06/2015 553
13/06/2015 8
22/06/2015 469
16/06/2015 1
17/06/2015 28
20/06/2015 331
28/06/2015 90
14/06/2015 33
8/06/2015 1
I want to see the results...
I could not get a kind of results...
How can I do?
Thanx!

How about just casting to date to remove any time component:
SELECT CAST(T1.UI_CreateDate as DATE) AS DATEDAY, COUNT(*) AS TOTALCOUNT
FROM mydb.dbo.LP_UseImpression T1
WHERE T1.UI_BR_BO_ID = 45
GROUP BY CAST(T1.UI_CreateDate as DATE)
ORDER BY DATEDAY;
SUM(1) for calculating the count does work. However, because SQL has the COUNT(*) function, it seems a bit awkward.

So you can group by DAY(T1.UI_CreateDate) or use full date for grouping. But these are different . As both these dates '2015-04-15' and '2015-12-15' result in same DAY value of 15.
Assuming you want to group on DAY rather than date please try the below version of query:
SELECT DISTINCT
T1.UI_CreateDate as DATEDAY,
count(1) over (PARTITION BY DAY(T1.UI_CreateDate) ) AS TOTALCOUNT
FROM mydb.dbo.LP_UseImpression T1 WHERE T1.UI_BR_BO_ID = 45
sql fiddle for demo: http://sqlfiddle.com/#!6/c3337/1

Oracle SQL weeks sales SUM

I have the sales data in terms of week:
ITEM LOC WEEK SALES
111 39 16/05/2015 10
222 39 16/05/2015 23
111 39 09/05/2015 13
222 39 09/05/2015 33
I want the sum of SALES column for the last 4 weeks.
So it comes like:
ITEM LOC 4-WEEKS-SALES
111 39 23
222 39 56

Just filter for last four weeks and agregate:
select ITEM, LOC,sum(SALES)
from theTable
where WEEK > SYSDATE - ( 7 * 4 )
group by ITEM,LOC

Try to this
select ITEM,LOC,sum(SALES) '4-WEEKS-SALES'
from tablename
where Datepart(wk, WEEK)>=(Datepart(wk, Getdate())-4)
Group by ITEM,LOC,Datepart(wk, WEEK)

how to get the unique records with min and max for each user

I have the following table:
id gender age highest weight lowest weight abc
a f 30 90 70 1.3
a f 30 90 65 null
a f 30 null null 1.3
b m 40 100 86 2.5
b m 40 null 80 2.5
c f 50 105 95 6.4
I need this result in sql server. What I need is the minimum of the weight and maximum of the weight and one record per user.
id gender age highest weight lowest weight abc
a f 30 90 65 1.3
b m 40 100 80 2.5
c f 50 105 95 6.4

Just do a grouping:
select id,
max(gender),
max(age),
max([highest weight]),
min([lowest weight]),
max(abc)
from SomeTable
group by id

You can do this using grouping:
select id, gender, max(highest_weight), min(lowwest_weight) from student
group by id, gender
But you need do define the rule for the other fields with variable value, like abc
Can you post more information?

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Efficient query for average count in SQL - sql

Related

Hive Summing up data in the table based on the date range

Aggregate result from query by quarter SQL

T-SQL Group by day date but i want show query full date

Oracle SQL weeks sales SUM

how to get the unique records with min and max for each user

Categories

Resources