Bigquery SQL MAX() text value - sql

Thanks for taking the time to look at this.
How to get the MAX of a text value in column A. I would like to and another where clause to show only one of the "FY20-Q4M#" (MAX value only)
Current Table
YY-QQ_STATUS
Program
FY20-Q2_ACTUALS
XYZ
FY20-Q3_ACTUALS
XYZ
FY20-Q3_BUDGET
XYZ
FY20-Q4M0
XYZ
FY20-Q4M1
XYZ
FY20-Q4M2
XYZ
FY20-Q4_BUDGET
XYZ
FY20-Q4_OUTLOOK
XYZ
Goal:
YY-QQ_STATUS
Program
FY20-Q2_ACTUALS
XYZ
FY20-Q3_ACTUALS
XYZ
FY20-Q3_BUDGET
XYZ
FY20-Q4M2
XYZ
FY20-Q4_BUDGET
XYZ
FY20-Q4_OUTLOOK
XYZ
SELECT
CASE WHEN UPPER(SCENARIO) LIKE '%ACTUALS%' THEN CONCAT(LEFT(FISCAL_YEAR,4) ,'-', QUARTER, '_ACTUALS')
WHEN UPPER(SCENARIO) LIKE 'Q%M%' THEN CONCAT(LEFT(FISCAL_YEAR,4) ,'-', QUARTER, SUBSTR(SCENARIO,3,2))
WHEN UPPER(SCENARIO) LIKE '%BUDGET%' THEN CONCAT(LEFT(FISCAL_YEAR,4) ,'-', QUARTER, '_BUDGET')
END AS SCENARIO,
CONCAT(LEFT(FISCAL_YEAR,4) ,'-', QUARTER) AS QUARTER,
PROGRAM_NAME,
FROM [XXXXFinance.FINANCE_OPS_REPORT_V] FIN
WHERE PROGRAM_NAME = 'Kiev20'
AND (XYZ_PER_NAME_QUARTER >= (SELECT XYZ_PER_NAME_QUARTER FROM [XXXXMaster_Data.CALENDAR]
WHERE CALENDAR_DATE_STR = CURRENT_DATE())
OR XYZ_PER_NAME_QUARTER < (SELECT XYZ_PER_NAME_QUARTER FROM [XXXXMaster_Data.CALENDAR]
WHERE CALENDAR_DATE_STR = CURRENT_DATE()) AND (SCENARIO CONTAINS 'Actual' OR SCENARIO CONTAINS 'Budget')
OR XYZ_PER_NAME_QUARTER < (SELECT XYZ_PER_NAME_QUARTER FROM [XXXXMaster_Data.CALENDAR]
WHERE CALENDAR_DATE_STR = CURRENT_DATE()) AND (SCENARIO CONTAINS 'Q%M%'))
group by 1,2,3
order by SCENARIO```

Try below
select
ifnull(format('FY%s-Q%sM%s', any_value(year), any_value(quartal), max(month)), any_value(YY_QQ_STATUS)) as YY_QQ_STATUS,
Program
from (
select *,
regexp_extract(YY_QQ_STATUS, r'FY(\d\d)-Q\dM\d') year,
regexp_extract(YY_QQ_STATUS, r'FY\d\d-Q(\d)M\d') quartal,
regexp_extract(YY_QQ_STATUS, r'FY\d\d-Q\dM(\d)') month
from `project.XXXXFinance.FINANCE_OPS_REPORT_V`
)
group by Program, ifnull(format('FY%s-Q%s', year, quartal), YY_QQ_STATUS)
if applied to sample data in y our question - output is

If you want one of the values, you could use row_number() and choose one row:
select t.* except (seqnum)
from (select t.*,
row_number() over (partition by left(YY_QQ_STATUS, 7) order by YY_QQ_STATUS desc) as seqnum
from t
) t
where not (YY_QQ_STATUS like 'FY20-Q4%' and seqnum > 1);

Related

Interview question:How to get last 3 month aggregation at column level?

This is the question i was being asked at Apple onsite interview and it blew my mind. Data is like this:
orderdate,unit_of_phone_sale
20190806,3000
20190704,3789
20190627,789
20190503,666
20190402,765
I had to write a query to get the result for each month sale, we should have last 3 month sales values. Let me put the expected output here.
order_monnth,M-1_Sale, M-2_Sale, M-3_Sale
201908,3000,3789,789,666
201907,3789,789,666,765
201906,789,666,765,0
201905,666,765,0,0
201904,765,0,0
I could only got the month wise sale and and used case statement by hardcoding month which was wrong. I banged my head to write this sql, but i could not.
Can anyone help on this. It will be really helpful for me to prepare for sql interviews
Update: This is what i tried
with abc as(
select to_char(order_date,'YYYYMM') as yearmonth,to_char(order_date,'YYYY') as year,to_char(order_date,'MM') as moth, sum(unit_of_phone_sale) as unit_sale
from t1 group by to_char(order_date,'YYYYMM'),to_char(order_date,'YYYY'),to_char(order_date,'MM'))
select yearmonth, year, case when month=01 then unit_sale else 0 end as M1_Sale,
case when month=02 then unit_sale else 0 end as M2_Sale...
case when month=12 then unit_sale else 0 end as M12_Sale
from abc
You will first of all need to sum the month's data and then use the LAG function to get previous months' data as following:
SELECT
ORDER_MONTH,
LAG(UNIT_OF_PHONE_SALE, 1) OVER(
ORDER BY
ORDER_MONTH
) AS "M-1_Sale",
LAG(UNIT_OF_PHONE_SALE, 2) OVER(
ORDER BY
ORDER_MONTH
) AS "M-2_Sale",
LAG(UNIT_OF_PHONE_SALE, 3) OVER(
ORDER BY
ORDER_MONTH
) AS "M-3_Sale"
FROM
(
SELECT
TO_CHAR(ORDERDATE, 'YYYYMM') AS ORDER_MONTH,
SUM(UNIT_OF_PHONE_SALE) AS UNIT_OF_PHONE_SALE
FROM
DATAA
GROUP BY
TO_CHAR(ORDERDATE, 'YYYYMM')
)
ORDER BY
ORDER_MONTH DESC;
Output:
ORDER_ M-1_Sale M-2_Sale M-3_Sale
------ ---------- ---------- ----------
201908 3789 789 666
201907 789 666 765
201906 666 765
201905 765
201904
db<>fiddle demo
Cheers!!
-- Update --
For the requirement mentioned in the comments, Following query will work for it.
CTE AS (
SELECT
TRUNC(ORDERDATE, 'MONTH') AS ORDER_MONTH,
SUM(UNIT_OF_PHONE_SALE) AS UNIT_OF_PHONE_SALE
FROM
DATAA
GROUP BY
TRUNC(ORDERDATE, 'MONTH')
)
SELECT
TO_CHAR(C.ORDER_MONTH,'YYYYMM') as ORDER_MONTH,
NVL(C1.UNIT_OF_PHONE_SALE, 0) AS "M-1_Sale",
NVL(C2.UNIT_OF_PHONE_SALE, 0) AS "M-2_Sale",
NVL(C3.UNIT_OF_PHONE_SALE, 0) AS "M-3_Sale"
FROM
CTE C
LEFT JOIN CTE C1 ON ( C1.ORDER_MONTH = ADD_MONTHS(C.ORDER_MONTH, - 1) )
LEFT JOIN CTE C2 ON ( C2.ORDER_MONTH = ADD_MONTHS(C.ORDER_MONTH, - 2) )
LEFT JOIN CTE C3 ON ( C3.ORDER_MONTH = ADD_MONTHS(C.ORDER_MONTH, - 3) )
ORDER BY
C.ORDER_MONTH DESC
Output:
db<>fiddle demo of updated answer.
Cheers!!
I think LEAD function can help here -
SELECT TO_CHAR(orderdate, 'YYYYMM') "DATE"
,unit_of_phone_sale M_1_Sale
,LEAD(unit_of_phone_sale,1,0) OVER(ORDER BY TO_CHAR(orderdate, 'YYYYMM') DESC) M_2_Sale
,LEAD(unit_of_phone_sale,2,0) OVER(ORDER BY TO_CHAR(orderdate, 'YYYYMM') DESC) M_3_Sale
,LEAD(unit_of_phone_sale,3,0) OVER(ORDER BY TO_CHAR(orderdate, 'YYYYMM') DESC) M_4_Sale
FROM table_sales
Here is the DB Fiddle
You can use this query:
select a.order_month, a.unit_of_phone_sale,
LEAD(unit_of_phone_sale, 1, 0) OVER (ORDER BY rownum) AS M_1,
LEAD(unit_of_phone_sale, 2, 0) OVER (ORDER BY rownum) AS M_2,
LEAD(unit_of_phone_sale, 3, 0) OVER (ORDER BY rownum) AS M_3
from (
select TO_CHAR(orderdate, 'YYYYMM') order_month,
unit_of_phone_sale,
rownum
from Y
order by order_month desc) a

Group by in columns and rows, counts and percentages per day

I have a table that has data like following.
attr |time
----------------|--------------------------
abc |2018-08-06 10:17:25.282546
def |2018-08-06 10:17:25.325676
pqr |2018-08-05 10:17:25.366823
abc |2018-08-06 10:17:25.407941
def |2018-08-05 10:17:25.449249
I want to group them and count by attr column row wise and also create additional columns in to show their counts per day and percentages as shown below.
attr |day1_count| day1_%| day2_count| day2_%
----------------|----------|-------|-----------|-------
abc |2 |66.6% | 0 | 0.0%
def |1 |33.3% | 1 | 50.0%
pqr |0 |0.0% | 1 | 50.0%
I'm able to display one count by using group by but unable to find out how to even seperate them to multiple columns. I tried to generate day1 percentage with
SELECT attr, count(attr), count(attr) / sum(sub.day1_count) * 100 as percentage from (
SELECT attr, count(*) as day1_count FROM my_table WHERE DATEPART(week, time) = DATEPART(day, GETDate()) GROUP BY attr) as sub
GROUP BY attr;
But this also is not giving me correct answer, I'm getting all zeroes for percentage and count as 1. Any help is appreciated. I'm trying to do this in Redshift which follows postgresql syntax.
Let's nail the logic before presenting:
with CTE1 as
(
select attr, DATEPART(day, time) as theday, count(*) as thecount
from MyTable
)
, CTE2 as
(
select theday, sum(thecount) as daytotal
from CTE1
group by theday
)
select t1.attr, t1.theday, t1.thecount, t1.thecount/t2.daytotal as percentofday
from CTE1 t1
inner join CTE2 t2
on t1.theday = t2.theday
From here you can pivot to create a day by day if you feel the need
I am trying to enhance the query #johnHC btw if you needs for 7days then you have to those days in case when
with CTE1 as
(
select attr, time::date as theday, count(*) as thecount
from t group by attr,time::date
)
, CTE2 as
(
select theday, sum(thecount) as daytotal
from CTE1
group by theday
)
,
CTE3 as
(
select t1.attr, EXTRACT(DOW FROM t1.theday) as day_nmbr,t1.theday, t1.thecount, t1.thecount/t2.daytotal as percentofday
from CTE1 t1
inner join CTE2 t2
on t1.theday = t2.theday
)
select CTE3.attr,
max(case when day_nmbr=0 then CTE3.thecount end) as day1Cnt,
max(case when day_nmbr=0 then percentofday end) as day1,
max(case when day_nmbr=1 then CTE3.thecount end) as day2Cnt,
max( case when day_nmbr=1 then percentofday end) day2
from CTE3 group by CTE3.attr
http://sqlfiddle.com/#!17/54ace/20
In case that you have only 2 days:
http://sqlfiddle.com/#!17/3bdad/3 (days descending as in your example from left to right)
http://sqlfiddle.com/#!17/3bdad/5 (days ascending)
The main idea is already mentioned in the other answers. Instead of joining the CTEs for calculating the values I am using window functions which is a bit shorter and more readable I think. The pivot is done the same way.
SELECT
attr,
COALESCE(max(count) FILTER (WHERE day_number = 0), 0) as day1_count, -- D
COALESCE(max(percent) FILTER (WHERE day_number = 0), 0) as day1_percent,
COALESCE(max(count) FILTER (WHERE day_number = 1), 0) as day2_count,
COALESCE(max(percent) FILTER (WHERE day_number = 1), 0) as day2_percent
/*
Add more days here
*/
FROM(
SELECT *, (count::float/count_per_day)::decimal(5, 2) as percent -- C
FROM (
SELECT DISTINCT
attr,
MAX(time::date) OVER () - time::date as day_number, -- B
count(*) OVER (partition by time::date, attr) as count, -- A
count(*) OVER (partition by time::date) as count_per_day
FROM test_table
)s
)s
GROUP BY attr
ORDER BY attr
A counting the rows per day and counting the rows per day AND attr
B for more readability I convert the date into numbers. Here I take the difference between current date of the row and the maximum date available in the table. So I get a counter from 0 (first day) up to n - 1 (last day)
C calculating the percentage and rounding
D pivot by filter the day numbers. The COALESCE avoids the NULL values and switched them into 0. To add more days you can multiply these columns.
Edit: Made the day counter more flexible for more days; new SQL Fiddle
Basically, I see this as conditional aggregation. But you need to get an enumerator for the date for the pivoting. So:
SELECT attr,
COUNT(*) FILTER (WHERE day_number = 1) as day1_count,
COUNT(*) FILTER (WHERE day_number = 1) / cnt as day1_percent,
COUNT(*) FILTER (WHERE day_number = 2) as day2_count,
COUNT(*) FILTER (WHERE day_number = 2) / cnt as day2_percent
FROM (SELECT attr,
DENSE_RANK() OVER (ORDER BY time::date DESC) as day_number,
1.0 * COUNT(*) OVER (PARTITION BY attr) as cnt
FROM test_table
) s
GROUP BY attr, cnt
ORDER BY attr;
Here is a SQL Fiddle.

How to get stockBalance of every end of the month

I have Four columns like
Date Customer InvoiceNo StockBalance
11/29/2017 A IN000414 5000
11/30/2017 B IN000415 4000
12/27/2017 A IN000416 3500
12/30/2017 B IN000417 2000
I want to get Stockbalance of every end of month, I need the output as
11/30/2017 B IN000415 4000
12/30/2017 B IN000417 2000
how could i get this could anybody guide to me?
You can use row_number() function :
select t.*
from (select *, row_number() over (partition by year(date), month(date) order by date desc) seq
from table
) t
where seq = 1;
EDIT : You want apply :
select t.*
from table t cross apply
( select top (1) t1.*
from table t1
where t1.Customer = t.Customer and
EOMONTH(t1.Dat) = t.Dat
order by t1.Dat desc
) t1;
Use row_number(), but be sure you include the year and month in the calculation:
select t.*
from (select t.*,
row_number() over (partition by year(date), month(date)
order by date desc
) as seqnum
from t
) t
where seqnum = 1;

SQL - values from two rows into new two rows

I have a query that gives a sum of quantity of items on working days. on weekend and holidays that quantity value and item value is empty.
I would like that on empty days is last known quantity and item.
My query is like this:
`select a.dt,b.zaliha as quantity,b.artikal as item
from
(select to_date('01-01-2017', 'DD-MM-YYYY') + rownum -1 dt
from dual
connect by level <= to_date(sysdate) - to_date('01-01-2017', 'DD-MM-YYYY') + 1
order by 1)a
LEFT OUTER JOIN
(select kolicina,sum(kolicina)over(partition by artikal order by datum_do) as zaliha,datum_do,artikal
from
(select sum(vv.kolicinaulaz-vv.kolicinaizlaz)kolicina,vz.datum as datum_do,vv.artikal
from vlpzaglavlja vz, vlpvarijante vv
where vz.id=vv.vlpzaglavlje
and vz.orgjed='01006'
and vv.skladiste='01006'
and vv.artikal in (3069,6402)
group by vz.datum,vv.artikal
order by vv.artikal,vz.datum asc)
order by artikal,datum_do asc)b
on a.dt=b.datum_do
where a.dt between to_date('12102017','ddmmyyyy') and to_date('16102017','ddmmyyyy')
order by a.dt`
and my output is like this:
and I want this:
In short, if quantity is null use lag(... ignore nulls) and coalesce or nvl:
select dt, item,
nvl(quantity, lag(quantity ignore nulls) over (partition by item order by dt))
from t
order by dt, item
Here is the full query, I cannot test it, but it is something like:
with t as (
select a.dt, b.zaliha as quantity, b.artikal as item
from (
select date '2017-10-10' + rownum - 1 dt
from dual
connect by date '2017-10-10' + rownum - 1 <= date '2017-10-16' ) a
left join (
select kolicina, datum_do, artikal,
sum(kolicina) over(partition by artikal order by datum_do) as zaliha
from (
select sum(vv.kolicinaulaz-vv.kolicinaizlaz) kolicina,
vz.datum as datum_do, vv.artikal
from vlpzaglavlja vz
join vlpvarijante vv on vz.id = vv.vlpzaglavlje
where vz.orgjed = '01006' and vv.skladiste='01006'
and vv.artikal in (3069,6402)
group by vz.datum, vv.artikal)) b
on a.dt = b.datum_do)
select *
from (
select dt, item,
nvl(quantity, lag(quantity ignore nulls)
over (partition by item order by dt)) qty
from t)
where dt >= date '2017-10-12'
order by dt, item
There are several issues in your query, major and minor:
in date generator (subquery a) you are selecting dates from long period, january to september, then joining with main tables and summing data and then selecting only small part. Why not filter dates at first?,
to_date(sysdate). sysdate is already date,
use ansi joins,
do not use order by in subqueries, it has no impact, only last ordering is important,
use date literals when defining dates, it is more readable.

How to select the last 12 months in sql?

I need to select the last 12 months. As you can see on the picture, May occurs two times.
But I only want it to occur once. And it needs to be the newest one.
Plus, the table should stay in this structure, with the latest month on the bottom.
And this is the query:
SELECT Monat2,
Monat,
CASE WHEN NPLAY_IND = '4P'
THEN 'QuadruplePlay'
WHEN NPLAY_IND = '3P'
THEN 'TriplePlay'
WHEN NPLAY_IND = '2P'
THEN 'DoublePlay'
WHEN NPLAY_IND = '1P'
THEN 'SinglePlay'
END AS Series,
Anzahl as Cnt
FROM T_Play_n
where NPLAY_IND != '0P'
order by Series asc ,Monat
This is the new query
SELECT sub.Monat2,sub.Monat,
CASE WHEN NPLAY_IND = '4P'
THEN 'QuadruplePlay'
WHEN NPLAY_IND = '3P'
THEN 'TriplePlay'
WHEN NPLAY_IND = '2P'
THEN 'DoublePlay'
WHEN NPLAY_IND = '1P'
THEN 'SinglePlay'
END
AS Series, Anzahl as Cnt FROM (SELECT ROW_NUMBER () OVER (PARTITION BY Monat2 ORDER BY Monat DESC)rn,
Monat2,
Monat,
Anzahl,
NPLAY_IND
FROM T_Play_n)sub
where sub.rn = 1
It does only show the months once but it doesn't do that for every Series.
So with every Play it should have 12 months.
In Oracle and SQL-Server you can use ROW_NUMBER.
name = month name and num = month number:
SELECT sub.name, sub.num
FROM (SELECT ROW_NUMBER () OVER (PARTITION BY name ORDER BY num DESC) rn,
name,
num
FROM tab) sub
WHERE sub.rn = 1
ORDER BY num DESC;
WITH R(N) AS
(
SELECT 0
UNION ALL
SELECT N+1
FROM R
WHERE N < 12
)
SELECT LEFT(DATENAME(MONTH,DATEADD(MONTH,-N,GETDATE())),3) AS [month]
FROM R
The With R(N) is a Common Table Expression.The R is the name of the result set (or table) that you are generating. And the N is the month number.
In SQL Server you can do It in following:
SELECT DateMonth, DateWithMonth -- Specify columns to select
FROM Tbl -- Source table
WHERE CAST(CAST(DateWithMonth AS INT) * 100 + 1 AS VARCHAR(20)) >= DATEADD(MONTH, -12,GETDATE()) -- Condition to return data for last 12 months
GROUP BY DateMonth, DateWithMonth -- Uniqueness
ORDER BY DateWithMonth -- Sorting to get latest records on the bottom
So it sounds like you want to select rows that contain the last occurrence of months. Something like this should work:
select * from [table_name]
where id in (select max(id) from [table_name] group by [month_column])
The last select in the brackets will get a list of id's for the last occurrence of each month. If the year+month column you have shown is not in descending order already, you might want to max this column instead.
You can use something like this(the table dbo.Nums contains int values from 0 to 11)
SELECT DATEADD(MONTH, DATEDIFF(MONTH, '19991201', CURRENT_TIMESTAMP) + n - 12, '19991201'),
DATENAME(MONTH,DateAdd(Month, DATEDIFF(month, '19991201', CURRENT_TIMESTAMP) + n - 12, '19991201'))
FROM dbo.Nums
I suggest to use a group by for the month name, and a max function for the numeric component. If is not numeric, use to_number().