Selecting rows with specific text to count and sum - SQL

I would need to determine the percentage of specific values by year.
The dataset has values as follows:
Year  Col             Value
2012  -20 p,             12
2012  -20 points, d      20
2012  -20 points, by     24
2012  -20 p, new         32
2012  -30 p,           1256
2012  -30 points, d      32
2012  -30 points, by     42
2012  -30 p, new        164
There are other years but for the example I selected only 2012.
For each year, I would like to determine the percentage as:
the values having the word "points" in the text,
divided by the values starting with -20.
Same for the -30 case.
Expected output for -20 in 2012:
(20+24)/(12+20+24+32)
I have tried as follows
Select year,
Col,
Count(0) as Value
, 100*count(0)/sum(count(case when Col like '-20%points%' then 1 end) over (partition by year, substr(Col, 1,2))) as pct_20
/* Same for 30 points */
From table1
Where /* conditions */
Group by 1,2
But I got the error Ordered analytical functions can not be nested.

I think you want conditional aggregation:
select year, substr(col, 1, 2),
sum(case when col like '%points%' then value end) / sum(value)
from t
group by 1, 2;
Based on your comment:
select year, substr(col, 1, 2),
sum(case when col like '%points%' then 1.0 end) / count(*)
from t
group by 1, 2;
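As a runnable sketch of the conditional-aggregation approach, here is a Python/sqlite3 version using the sample data from the question; the table name t and the three-character prefix extraction are assumptions for the illustration, not the asker's actual schema:

```python
import sqlite3

# In-memory table with the sample rows from the question (assumed schema).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (year INTEGER, col TEXT, value INTEGER)")
con.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (2012, "-20 p,", 12),
        (2012, "-20 points, d", 20),
        (2012, "-20 points, by", 24),
        (2012, "-20 p, new", 32),
        (2012, "-30 p,", 1256),
        (2012, "-30 points, d", 32),
        (2012, "-30 points, by", 42),
        (2012, "-30 p, new", 164),
    ],
)

# Conditional aggregation: the CASE feeds only the "points" rows into SUM,
# which is then divided by the total value per (year, prefix) group.
rows = con.execute("""
    SELECT year,
           substr(col, 1, 3) AS prefix,
           1.0 * SUM(CASE WHEN col LIKE '%points%' THEN value END) / SUM(value) AS pct
    FROM t
    GROUP BY year, substr(col, 1, 3)
    ORDER BY prefix
""").fetchall()

for year, prefix, pct in rows:
    print(year, prefix, pct)
```

For the -20 group this yields (20+24)/(12+20+24+32) = 0.5, matching the expected output in the question.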

You can only nest an aggregate within an OLAP function, but not vice versa:
, 100*count(*)/NULLIF(sum(count(case when Col like '-20%points%' then 1 end))
over (partition by year, substr(Col, 1,2)), 0) as pct_20


SQL Group By weeks and months in the same time (Redshift)

In the code below I am selecting 42 days period and grouping it by SNAPSHOT_WEEK (where SNAPSHOT_WEEK has a number from 1 to 52(53) during the year).
SELECT
CASE
WHEN video_code = 'A' THEN 'Seller'
WHEN video_code = 'B' THEN 'Vendor'
WHEN video_code = 'C' THEN 'Others'
END AS CATEGORY,
TO_CHAR(snapshot_time - DATE_PART('dow', snapshot_time)::int + 4, 'IW') AS SNAPSHOT_WEEK,
SUM(VIOLATION_COUNT)
FROM my_table
WHERE 1=1
AND snapshot_time BETWEEN '20180505'::date - '41 days'::interval AND '20180505'::date -- to calculate WoW
GROUP BY
CATEGORY, SNAPSHOT_WEEK;
Output for this query looks like this:
CATEGORY WEEK OR MONTH SUM_VIOLATION_COUNT
A 14 954
B 14 454
C 14 299
A 15 954
B 16 454
Is it possible, in the same query, besides grouping by week, to also group this data by month, where a month runs from the 28th of one month to the 28th of the next?
For example, in my output I need column that will show following values:
CATEGORY WEEK OR MONTH SUM_VIOLATION_COUNT
A 14 954
B 14 454
C 14 299
A 15 954
B 16 454
C 17 299
A 28 March 9354
B 28 March 2454
C 28 March 5354
A 28 April 1354
...... ..... .....
Here "28 March" means the number of violations between 28 Feb and 28 March, "28 April" the number of violations between 28 Feb and 28 April, etc.
Is that possible to do using the same query?
You can do that with a WITH subquery (a CTE); it lets you run the base query once against the database and then group it twice, each time with its own logic.
Your query has some disconnects between the column names, but it will look something like this.
P.S. UNION requires the same number of columns in both SELECTs.
WITH ALLDATA AS (
SELECT
CASE
WHEN video_code = 'A' THEN 'Seller'
WHEN video_code = 'B' THEN 'Vendor'
WHEN video_code = 'C' THEN 'Others'
END AS CATEGORY,
TO_CHAR(snapshot_time - DATE_PART('dow', snapshot_time)::int + 4, 'IW') AS SNAPSHOT_WEEK,
SUM(VIOLATION_COUNT) SUM_VIOLATION_COUNT
FROM my_table
WHERE 1=1
AND snapshot_time BETWEEN '20180505'::date - '41 days'::interval AND '20180505'::date -- to calculate WoW
GROUP BY
CATEGORY, SNAPSHOT_WEEK)
SELECT CATEGORY, SNAPSHOT_WEEK, SUM_VIOLATION_COUNT FROM ALLDATA
UNION
SELECT CATEGORY, SNAPSHOT_WEEK, SUM_VIOLATION_COUNT FROM ALLDATA
GROUP BY <your month grouping logic>
To reiterate the logic in pseudo code
WITH ALLDATA AS (
SELECT <your base data without group by> )
SELECT columns FROM ALLDATA
GROUP BY <weekly group by logic>
UNION
SELECT columns FROM ALLDATA
GROUP BY <monthly group by logic>
You would need to UNION the output of two separate queries to generate those results.
The basic rule is that one input row will map to (at most) one output row.
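As a runnable sketch of the overall pattern (one base result set, grouped two different ways and stacked with UNION ALL), here is a Python/sqlite3 illustration. The sample rows, the week labels, and the "month starting on the 28th" bucketing (implemented by shifting each date back 27 days before taking its month) are all assumptions made for the demo, not the asker's real data:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE my_table (snapshot_time TEXT, category TEXT, violation_count INTEGER)")
con.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        ("2018-04-02", "Seller", 10),
        ("2018-04-09", "Seller", 20),
        ("2018-04-30", "Vendor", 5),
        ("2018-05-01", "Vendor", 7),
    ],
)

# One base set (the CTE), grouped two ways and stacked with UNION ALL:
#  - weekly buckets use the week number,
#  - "monthly" buckets start on the 28th: shifting a date back 27 days
#    before taking its month puts the 28th..27th range into one bucket.
rows = con.execute("""
    WITH alldata AS (
        SELECT snapshot_time, category, violation_count FROM my_table
    )
    SELECT category,
           'week ' || strftime('%W', snapshot_time) AS bucket,
           SUM(violation_count) AS total
    FROM alldata
    GROUP BY category, bucket
    UNION ALL
    SELECT category,
           'month from 28th: ' || strftime('%m', date(snapshot_time, '-27 days')),
           SUM(violation_count)
    FROM alldata
    GROUP BY category, strftime('%m', date(snapshot_time, '-27 days'))
""").fetchall()

for r in rows:
    print(r)
```

Each input row contributes to exactly one weekly bucket and one monthly bucket, which is why the two groupings must be UNIONed rather than combined in a single GROUP BY.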

How to use lead/lag functions in Oracle

I have written a query to get the result below.
Note: I have months starting from jan-2016 to jan-2018.
There are two types, either 'hist' or 'future'.
Resulting dataset:
In this example, consider the combination of id1+id2+id3 to be 1, 2, 3.
type month id1 id2 id3 value
hist jan-17 1 2 3 10
hist feb-17 1 2 3 20
future jan-17 1 2 3 15
future feb-17 1 2 3 1
hist mar-17 1 2 3 2
future apr-17 1 2 3 5
My calculation logic depends on the month's position within its quarter.
For example, for January (the first month of the quarter) I want the value to be: future of Jan + future of Feb + future of Mar.
So for jan-17 the output should be 15 + 1 + 0 (there is no corresponding future value for March).
For February (the second month of the quarter), the value should be: hist of Jan + future of Feb + future of Mar, i.e. 10 + 1 + 0 (the future value for March is not available).
Similarly for March, the value should be: hist of Jan + hist of Feb + future of Mar, i.e. 10 + 20 + 0 (no forecast for March is present).
The same applies to April, May, and June, depending on each month's position in the quarter.
I am aware of the lead/lag functions, but I am not able to apply them here.
Can someone please help?
I would not mess with lag; this can all be done with a GROUP BY if you convert your dates to quarters:
WITH
dset
AS
(SELECT DATE '2017-01-17' month, 5 VALUE
FROM DUAL
UNION ALL
SELECT DATE '2017-02-17' month, 6 VALUE
FROM DUAL
UNION ALL
SELECT DATE '2017-03-25' month, 7 VALUE
FROM DUAL
UNION ALL
SELECT DATE '2017-05-25' month, 4 VALUE
FROM DUAL)
SELECT SUM (VALUE) value_sum, TO_CHAR (month, 'q') quarter, TO_CHAR (month, 'YYYY') year
FROM dset
GROUP BY TO_CHAR (month, 'q'), TO_CHAR (month, 'YYYY');
This results in:
VALUE_SUM QUARTER YEAR
18 1 2017
4 2 2017
We can use an analytic function if you need the result on each record:
SELECT SUM (VALUE) OVER (PARTITION BY TO_CHAR (month, 'q'), TO_CHAR (month, 'YYYY')) quarter_sum, month, VALUE
FROM dset
This results in:
QUARTER_SUM MONTH VALUE
18 1/17/2017 5
18 2/17/2017 6
18 3/25/2017 7
4 5/25/2017 4
Make certain you include year, you don't want to combine quarters from different years.
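The quarter aggregation above can be checked quickly in Python with sqlite3; since SQLite has no 'q' date format, the quarter is derived from the month number instead. The dates mirror the answer's sample data:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dset (month TEXT, value INTEGER)")
con.executemany(
    "INSERT INTO dset VALUES (?, ?)",
    [("2017-01-17", 5), ("2017-02-17", 6), ("2017-03-25", 7), ("2017-05-25", 4)],
)

# Quarter = (month_number - 1) / 3 + 1, using integer division,
# standing in for Oracle's TO_CHAR(month, 'q').
rows = con.execute("""
    SELECT SUM(value) AS value_sum,
           (CAST(strftime('%m', month) AS INTEGER) - 1) / 3 + 1 AS quarter,
           strftime('%Y', month) AS year
    FROM dset
    GROUP BY quarter, year
    ORDER BY quarter
""").fetchall()
print(rows)
```

As in the answer, Jan+Feb+Mar of 2017 roll up to 18 in quarter 1 and the May row stands alone as 4 in quarter 2.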
Well, as said in one of the comments, the trick lies in another question of yours and the corresponding answer. It goes somewhat like this:
with
x as
(select 'hist' type, To_Date('JAN-2017','MON-YYYY') ym , 10 value from dual union all
select 'future' type, To_Date('JAN-2017','MON-YYYY'), 15 value from dual union all
select 'future' type, To_Date('FEB-2017','MON-YYYY'), 1 value from dual),
y as
(select * from x Pivot(Sum(Value) For Type in ('hist' as h,'future' as f))),
/* Pivot for easy lag,lead query instead of working with rows..*/
z as
(
select ym,sum(h) H,sum(f) F from (
Select y.ym,y.H,y.F from y
union all
select add_months(to_Date('01-JAN-2017','DD-MON-YYYY'),rownum-1) ym, 0 H, 0 F
from dual connect by rownum <=3 /* depends on how many months you are querying...
so this dual adds the corresponding missing 0 records...*/
) group by ym
)
select
ym,
Case
When MOD(Extract(Month from YM),3) = 1
Then F + Lead(F,1) Over(Order by ym) + Lead(F,2) Over(Order by ym)
When MOD(Extract(Month from YM),3) = 2
Then Lag(H,1) Over(Order by ym) + F + Lead(F,1) Over(Order by ym)
When MOD(Extract(Month from YM),3) = 3
Then Lag(H,2) Over(Order by ym) + Lag(H,1) Over(Order by ym) + F
End Required_Value
from z
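The CASE logic above (hist for the months of the quarter already passed, future from the current month onward, with missing values counting as 0) can be sanity-checked in plain Python. The values are the ones from the question, and required_value is a hypothetical helper name chosen for this sketch:

```python
# hist/future values per month number (from the question's sample data);
# there is no future value for March, so it is simply absent.
hist = {1: 10, 2: 20, 3: 2}
future = {1: 15, 2: 1}

def required_value(month, hist, future):
    """Sum over the month's quarter: hist for months before the given month,
    future for the given month and the rest of the quarter.
    Missing values count as 0."""
    q_start = month - (month - 1) % 3        # first month of the quarter
    total = 0
    for m in range(q_start, q_start + 3):
        src = hist if m < month else future
        total += src.get(m, 0)
    return total

print(required_value(1, hist, future))  # future Jan + future Feb + future Mar
print(required_value(2, hist, future))  # hist Jan + future Feb + future Mar
print(required_value(3, hist, future))  # hist Jan + hist Feb + future Mar
```

This reproduces the worked examples in the question: 15+1+0, 10+1+0, and 10+20+0.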

Transpose data in SQL

Can someone assist me on this?
As you can see from the first picture (original data), I have dates in the format "Mar-12" and data for the years 2014, 2015, 2016 and 2017.
Now, I need to insert new column "year" where I need to put the year from Jan-14, Jan-15, Jan-16, Feb-16 etc.
Basically, I need some kind of data transpose, I think.
In the second picture "Final Order" I show in which order I need the data.
I don't know what my DBMS is.
So, this is how my data (original) looks like:
Customer | Section | Data | Jan-14 | Feb-14 | Jan-15 | Feb-15
Total      Fore      SR        10       20       30       35
Total      Fore      TK         5        4       12       10
===================================================
And I need to put the data in this form:
Customer | Section | Data | Year | Jan | Feb
Total      Fore      SR     2014   10    20
Total      Fore      TK     2014    5     4
Total      Fore      SR     2015   30    35
Total      Fore      TK     2015   12    10
Given your sample of
create table t (Customer varchar(5), Section varchar(4), Data varchar(2), [Jan-14] int, [Feb-14] int, [Jan-15] int, [Feb-15] int)
insert into t values
('Total' , 'Fore' , 'SR' , 10 , 20, 30, 35),
('Total' , 'Fore' , 'TK' , 5 , 4, 12, 10)
you can solve this, if your SQL dialect is MS SQL Server, by unpivoting and then grouping, like so:
select customer,section,data,yyyy,
sum(case when mm='Jan' then dts else 0 end) as 'Jan',
sum(case when mm='Feb' then dts else 0 end) as 'Feb'
from
(
select customer,section,data,
dummy,
substring(dummy,1,3) as mm,
concat('20',substring(dummy,5,2)) as yyyy,
dts
from
(
select customer,section,data,
[Jan-14] , [Feb-14] , [Jan-15] , [Feb-15]
from t
) pvt
UNPIVOT
(dts FOR dummy IN
([Jan-14] , [Feb-14] , [Jan-15] , [Feb-15])
)AS unpvt
) x
group by customer,section,yyyy,data
result
customer section data yyyy Jan Feb
-------- ------- ---- ---- ----------- -----------
Total Fore SR 2014 10 20
Total Fore TK 2014 5 4
Total Fore SR 2015 30 35
Total Fore TK 2015 12 10
If your sql dialect does not have unpivot you can
select customer,section,data,yyyy,
sum(case when mm='Jan' then dts else 0 end) as 'Jan',
sum(case when mm='Feb' then dts else 0 end) as 'Feb'
from
(
select customer,section,data,2014 as yyyy,'Jan' as mm,[Jan-14] as dts from t
union all
select customer,section,data,2014 as yyyy,'Feb' as mm,[Feb-14] as dts from t
union all
select customer,section,data,2015 as yyyy,'Jan' as mm,[Jan-15] as dts from t
union all
select customer,section,data,2015 as yyyy,'Feb' as mm,[Feb-15] as dts from t
) x
group by customer,section,yyyy,data
Clearly either method is a pain if you have an unknown or variable number of columns (or just a lot of them), in which case you would need to write a script that generates the SQL statement for submission to dynamic SQL.
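The UNION ALL fallback is easy to verify with Python's sqlite3, which, like many dialects, has no UNPIVOT. This sketch reuses the answer's sample rows:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE t (Customer TEXT, Section TEXT, Data TEXT, '
            '"Jan-14" INTEGER, "Feb-14" INTEGER, "Jan-15" INTEGER, "Feb-15" INTEGER)')
con.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?, ?)",
    [("Total", "Fore", "SR", 10, 20, 30, 35),
     ("Total", "Fore", "TK", 5, 4, 12, 10)],
)

# Unpivot each month column with UNION ALL, then re-pivot per year
# with conditional aggregation, exactly as in the answer.
rows = con.execute("""
    SELECT Customer, Section, Data, yyyy,
           SUM(CASE WHEN mm = 'Jan' THEN dts ELSE 0 END) AS Jan,
           SUM(CASE WHEN mm = 'Feb' THEN dts ELSE 0 END) AS Feb
    FROM (
        SELECT Customer, Section, Data, 2014 AS yyyy, 'Jan' AS mm, "Jan-14" AS dts FROM t
        UNION ALL
        SELECT Customer, Section, Data, 2014, 'Feb', "Feb-14" FROM t
        UNION ALL
        SELECT Customer, Section, Data, 2015, 'Jan', "Jan-15" FROM t
        UNION ALL
        SELECT Customer, Section, Data, 2015, 'Feb', "Feb-15" FROM t
    )
    GROUP BY Customer, Section, Data, yyyy
    ORDER BY yyyy, Data
""").fetchall()
for r in rows:
    print(r)
```

The output matches the "Final Order" table: one row per customer/section/data per year, with the month values as columns.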

more than one AVG column with different conditions

I have a table as follows:
id year value
1 2012 10
2 2013 7
3 2013 7
4 2014 8
5 2014 10
6 2015 6
7 2011 12
I need to write a query which gives the AVG value of the last 4 years from today. Meaning that if today is 2016 then the AVG is on 2015,2014,2013.
Basically this could be done with 3 queries:
Select avg(value) as a
from tab
where year=2015
and
Select avg(value) as b
from tab
where year=2014
and
Select avg(value) as c
from tab
where year=2013
The results based on the given values should be:
2013 7
2014 9
2015 6
Since all of them are on the same table, how can I do that in one query (PostgreSQL)?
It should be without a WHERE.
Something like:
Select avg(with condition) as a, avg(with condition) as b, avg(with condition) as c
from tab
You can group by year and restrict to the years you want in your where clause:
select avg(value), year
from tab
where year in (2013,2014,2015)
group by year
The query above will give you 3 separate rows. If you prefer a single row then you can use conditional aggregation instead of a group by
select
avg(case when year = 2013 then value end) as avg_2013,
avg(case when year = 2014 then value end) as avg_2014,
avg(case when year = 2015 then value end) as avg_2015
from tab
where year in (2013,2014,2015)
To make the years relative to the current date instead of hard-coding them:
select
avg(case when year = date_part('year', NOW()) then value end) as avg_2016,
avg(case when year = ((date_part('year', NOW())) - 1 ) then value end) as avg_2015,
avg(case when year = ((date_part('year', NOW())) - 2 ) then value end) as avg_2014,
avg(case when year = ((date_part('year', NOW())) - 3 ) then value end) as avg_2013
from tab
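Both variants can be checked with Python's sqlite3 using the question's sample rows; note how each AVG in the conditional form only sees the rows where its CASE produced a value, because AVG ignores NULLs:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tab (id INTEGER, year INTEGER, value INTEGER)")
con.executemany(
    "INSERT INTO tab VALUES (?, ?, ?)",
    [(1, 2012, 10), (2, 2013, 7), (3, 2013, 7), (4, 2014, 8),
     (5, 2014, 10), (6, 2015, 6), (7, 2011, 12)],
)

# One row per year via GROUP BY ...
per_year = con.execute("""
    SELECT year, AVG(value) FROM tab
    WHERE year IN (2013, 2014, 2015)
    GROUP BY year ORDER BY year
""").fetchall()

# ... or a single row via conditional aggregation.
single = con.execute("""
    SELECT AVG(CASE WHEN year = 2013 THEN value END) AS avg_2013,
           AVG(CASE WHEN year = 2014 THEN value END) AS avg_2014,
           AVG(CASE WHEN year = 2015 THEN value END) AS avg_2015
    FROM tab
""").fetchone()

print(per_year)
print(single)
```

Both agree with the expected results in the question: 7 for 2013, 9 for 2014, and 6 for 2015.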

Query the Minimum Value per day within a month's worth of data

I have two sets of pricing data (A and B). Set A consists of all of my pricing data per order over a month. Set B consists of all of my competitor's pricing data over the same month. I want to compare my competitor's lowest price to each of my prices per day.
Graphically, the data appears like this:
Date  Set A  Set B
1        25     31
1        54     47
1        23     56
1        12     23
1        76     40
1        42
I want to pass only the lowest price to a CASE statement which evaluates which prices are better. I would like to process an entire month's worth of data all at once, so in my example, dates 1 through 30 (or 31) would be included and crunched together, and for each day there would be only one value from set B: the lowest price in the set.
Important note: Set B does not have a data point for each point in Set A.
Hopefully this makes sense. Thanks in advance for any help you may be able to render.
That's a strange example you have - do you really have prices ranging from 12 to 76 within a single day?
Anyway, left joining your (grouped) data with their (grouped) data should work (untested):
with
my_min_prices as (
select price_date, min(price_value) min_price from my_prices group by price_date),
their_min_prices as (
select price_date, min(price_value) min_price from their_prices group by price_date)
select
mine.price_date,
(case
when theirs.min_price is null then mine.min_price
when theirs.min_price >= mine.min_price then mine.min_price
else theirs.min_price
end) min_price
from
my_min_prices mine
left join their_min_prices theirs on mine.price_date = theirs.price_date
I'm still not sure that I understand your requirements. My best guess is that you want something like
with your_data as (
select 1 date_id, 25 price_a, 31 price_b from dual
union all
select 1, 54, 47 from dual union all
select 1, 23, 56 from dual union all
select 1, 12, 23 from dual union all
select 1, 76, 40 from dual union all
select 1, 42, null from dual)
select date_id,
sum(case when price_a < min_price_b then 1 else 0 end) better,
sum(case when price_a = min_price_b then 1 else 0 end) tie,
sum(case when price_a > min_price_b then 1 else 0 end) worse
from (select date_id,
price_a,
min(price_b) over (partition by date_id) min_price_b
from your_data)
group by date_id
DATE_ID BETTER TIE WORSE
---------- ---------- ---------- ----------
1 1 1 4
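The same better/tie/worse tally can be reproduced with Python's sqlite3; here a correlated scalar subquery stands in for the analytic MIN(...) OVER (PARTITION BY ...), so this sketch also works on engines without window functions:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE prices (date_id INTEGER, price_a INTEGER, price_b INTEGER)")
con.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [(1, 25, 31), (1, 54, 47), (1, 23, 56),
     (1, 12, 23), (1, 76, 40), (1, 42, None)],
)

# Compare each price_a on a day against that day's minimum price_b.
# The correlated subquery computes the per-day minimum (NULLs are
# ignored by MIN), replacing the analytic function in the answer.
rows = con.execute("""
    SELECT date_id,
           SUM(CASE WHEN price_a < min_b THEN 1 ELSE 0 END) AS better,
           SUM(CASE WHEN price_a = min_b THEN 1 ELSE 0 END) AS tie,
           SUM(CASE WHEN price_a > min_b THEN 1 ELSE 0 END) AS worse
    FROM (SELECT p.date_id, p.price_a,
                 (SELECT MIN(q.price_b) FROM prices q
                  WHERE q.date_id = p.date_id) AS min_b
          FROM prices p)
    GROUP BY date_id
""").fetchall()
print(rows)
```

With the sample data the minimum competitor price on day 1 is 23, giving one better price (12), one tie (23), and four worse, matching the result table above.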