Google BigQuery Dimensions SQL Multiple Grids - sql

How would you select multiple GRIDs in the example below, instead of just the one in the WHERE clause?
SELECT
sdg_code,
sdg_name,
"grid.5170.3" as grid,
year,
COUNT(DISTINCT id) as pubs,
ROUND(AVG(fcr), 1) as fcr,
ROUND(EXP(AVG(LOG(GREATEST(fcr, 1)))), 1) as fcr_geomean,
ROUND(sum(AltWithScore), 1) as altmetric
FROM
(
SELECT
p.id,
year,
if(p.altmetrics.score > 0, 1, 0) as AltWithScore,
cat_sdg.code as sdg_code,
cat_sdg.name as sdg_name,
p.metrics.field_citation_ratio as fcr,
p.altmetrics.score as altmetric_score,
row_number() over(partition by p.id, cat_sdg.code) as rn
FROM
`dimensions-ai.data_analytics.publications` p,
UNNEST(category_sdg.full) cat_sdg
WHERE
year >= 2011
AND year <= 2020
AND "grid.5170.3" in UNNEST(research_orgs)
)
WHERE rn = 1
GROUP BY
sdg_code,
sdg_name,
year
ORDER BY year asc
What needs to be changed:
The query currently runs for only one organisation; I would like it to run for 11 organisations.
Each organisation is identified by an ID called a "GRID", which looks like this:
"grid.5170.3"
I want the new query to cover 10 more org IDs. These are the 10:
grid.5254.6, grid.7048.b, grid.5117.2, grid.10825.3e, grid.4655.2, grid.11702.35, grid.154185.c, grid.475435.4, grid.7143.1, grid.27530.33
I would also like to add a column with the org name; currently there is only a column with the org ID.
Thanks, I'm new to this whole thing.

Instead of WHERE year >= 2011 AND year <= 2020 AND "grid.5170.3" in UNNEST(research_orgs), use the following:
WHERE year >= 2011 AND year <= 2020
AND EXISTS (
SELECT 1
FROM UNNEST(research_orgs) grid
WHERE grid IN ('grid.5170.3', 'grid.5254.6', 'grid.7048.b', 'grid.5117.2', 'grid.10825.3e', 'grid.4655.2', 'grid.11702.35', 'grid.154185.c', 'grid.475435.4', 'grid.7143.1', 'grid.27530.33')
)
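A minimal sketch of why the EXISTS form is a good fit here, written in Python with the standard-library sqlite3 module. SQLite has no UNNEST, so the repeated research_orgs field is modelled as a hypothetical child table pub_orgs; the table names, publication IDs and the GRID grid.0000.0 are made up for illustration. It shows that EXISTS returns each publication once even when several of its orgs are in the list, whereas a plain join would return it once per matching org.

```python
import sqlite3

# Hypothetical miniature of the publications data. BigQuery's repeated
# research_orgs field is modelled as a child table (SQLite has no UNNEST).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE publications (id TEXT PRIMARY KEY, year INTEGER);
    CREATE TABLE pub_orgs (pub_id TEXT, grid TEXT);
    INSERT INTO publications VALUES ('pub1', 2015), ('pub2', 2016), ('pub3', 2017);
    INSERT INTO pub_orgs VALUES
        ('pub1', 'grid.5170.3'),
        ('pub1', 'grid.5254.6'),
        ('pub2', 'grid.7048.b'),
        ('pub3', 'grid.0000.0');
""")

# A shortened version of the 11-GRID list from the answer above.
WANTED = ("grid.5170.3", "grid.5254.6", "grid.7048.b")
placeholders = ", ".join("?" for _ in WANTED)

# EXISTS: a publication qualifies if at least one of its orgs is wanted.
exists_sql = f"""
    SELECT p.id
    FROM publications p
    WHERE p.year >= 2011 AND p.year <= 2020
      AND EXISTS (SELECT 1 FROM pub_orgs o
                  WHERE o.pub_id = p.id AND o.grid IN ({placeholders}))
    ORDER BY p.id
"""

# Plain join: one output row per matching org, so pub1 appears twice.
join_sql = f"""
    SELECT p.id
    FROM publications p
    JOIN pub_orgs o ON o.pub_id = p.id
    WHERE p.year >= 2011 AND p.year <= 2020
      AND o.grid IN ({placeholders})
    ORDER BY p.id
"""

exists_ids = [row[0] for row in conn.execute(exists_sql, WANTED)]
join_ids = [row[0] for row in conn.execute(join_sql, WANTED)]
print(exists_ids)
print(join_ids)
```

The join form is what you would need if the GRID itself had to become an output column, at the cost of counting a publication once per matching organisation.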

Related

arithmetic operation on alias of column of table [closed]

I have written a query that fetches the data shown below. I want to calculate opening and maindata using the formulas (CurrentEmp + Empjoined - EmpLeft) as opening and ((EmpLeft*100)/((CurrentEmp+opening)/2)) as maindata. I have written these into the query, but I get an "invalid column name" error.
month year CurrentEmp join leftemp
January 2021 10 2 1
February 2021 15 3 2
March 2021 20 5 2
and the output that I expect is
month year CurrentEmp join leftemp opening
January 2021 10 2 1 11
February 2021 15 3 2 16
March 2021 20 5 2 23
I have written the below code
with t0(n) as (
    select n from (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) t(n)
),
ns as (
    select row_number() over (order by t1.n) - 1 n
    from t0 t1, t0 t2, t0 t3
),
calendar as (
    select top(12)
        DATEADD(month, n, '2021-01-01') dt,
        DATEADD(month, n, '2021-01-31') dd
    from ns
    order by n
)
select
    cast(DATENAME(month, dt) as nvarchar(max)) AS month,
    cast(DATENAME(YEAR, dt) as nvarchar(max)) AS Year,
    (SELECT COUNT(*)
     FROM EmployeeDetail e
     left join Separation s on e.Id = s.EmployeeId
     WHERE (e.CompanyId = 1 and e.DateOfJoining < calendar.dt and e.EmpStatus = 1)
        or (s.CompanyId = 1 and e.DateOfJoining < calendar.dt and s.LastWorkingDate >= calendar.dt)) AS CurrentEmp,
    (select count(*) from EmployeeDetail
     where DateOfJoining >= calendar.dt and DateOfJoining <= calendar.dd and CompanyId = 1) as Empjoined,
    (select count(*) from Separation
     where LastWorkingDate >= calendar.dt and LastWorkingDate <= calendar.dd and CompanyId = 1) as EmpLeft,
    (CurrentEmp+Empjoined-EmpLeft) as opening,
    cast(((EmpLeft*100)/((CurrentEmp+opening)/2)) as decimal(10,2)) as maindata
from calendar
order by dt
An alias in the current select statement cannot be used as a variable within that same select statement.
You'll need to either:
Repeat the calculation:
(CurrentEmp+Empjoined-EmpLeft) as opening,
-- NOTE: opening IS REPLACED BY SAME CALC: (CurrentEmp+Empjoined-EmpLeft)
cast (((EmpLeft*100) /
((CurrentEmp+(CurrentEmp+Empjoined-EmpLeft))/2))
as decimal(10,2)) as maindata
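Or, move the primary calculation of opening to a sub-query (or CTE) and compute maindata in the outer query.
Note that CurrentEmp, Empjoined and EmpLeft are themselves aliases defined in the same SELECT, so repeating the calculation only works once those three values come from an inner query. With them available, given the simplicity and low overhead of the calculation, I'd just repeat the calculation.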
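The sub-query route can be sketched as follows, using Python's standard-library sqlite3 module rather than SQL Server. The monthly table and its month_no column are hypothetical stand-ins holding the three counts from the question's sample rows; in the real query they would come from the correlated subqueries. The inner query names opening, and the outer query is then free to use it. Multiplying by 100.0 and dividing by 2.0 is a deliberate change from the original formula so that the percentage is not truncated by integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table holding the counts from the question's sample rows.
    CREATE TABLE monthly (month_no INTEGER, month TEXT, year INTEGER,
                          CurrentEmp INTEGER, Empjoined INTEGER, EmpLeft INTEGER);
    INSERT INTO monthly VALUES
        (1, 'January',  2021, 10, 2, 1),
        (2, 'February', 2021, 15, 3, 2),
        (3, 'March',    2021, 20, 5, 2);
""")

sql = """
    WITH base AS (
        -- Inner query: define the alias once.
        SELECT month_no, month, CurrentEmp, Empjoined, EmpLeft,
               CurrentEmp + Empjoined - EmpLeft AS opening
        FROM monthly
    )
    -- Outer query: the alias is now an ordinary column.
    SELECT month, opening,
           ROUND(EmpLeft * 100.0 / ((CurrentEmp + opening) / 2.0), 2) AS maindata
    FROM base
    ORDER BY month_no
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

The opening values come out as 11, 16 and 23, matching the expected output in the question.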

Query to get top product gainers by sales over previous week

I have a database table with three columns.
WeekNumber, ProductName, SalesCount
Sample data is shown in the table below. I want the top 10 gainers (by %) for week 26 over the previous week, i.e. week 25. The only condition is that the product must have a sales count greater than 0 in both weeks.
In the sample data B,C,D are the common products and C has the highest % gain.
Similarly, I will need top 10 losers also.
What I have tried so far is an inner join to get the products common to both weeks. However, I am not able to work out the top-gainers logic.
The output should be like
Product PercentGain
C 400%
D 12.5%
B 10%
This will give you a generic answer, not just for any particular week:
select top 10 product , gain [gain%]
from
(
SELECT curr.product, (curr.salescount - prev.salescount) * 100.0 / prev.salescount gain -- 100.0 avoids integer division
from
(select weeknumber, product, salescount from tbl) prev
JOIN
(select weeknumber, product, salescount from tbl) curr
on prev.weeknumber = curr.weeknumber - 1
AND prev.product = curr.product
where prev.salescount > 0 and curr.salescount > 0
)A
order by gain desc
If you are interested in weeks 25 and 26, then just add the condition below in the WHERE clause:
and prev.weeknumber = 25
If you are using SQL-Server 2012 (or newer), you could use the lag function to match "this" weeks sales with the previous week's. From there on, it's just some math:
SELECT TOP 10 product, 1.0 * sales / prev_sales - 1 AS gain
FROM (SELECT product,
weeknumber,
sales,
LAG(sales) OVER (PARTITION BY product
ORDER BY weeknumber) AS prev_sales
FROM mytable) t
WHERE weeknumber = 26 AND
sales > 0 AND
prev_sales > 0 AND
sales > prev_sales
ORDER BY gain DESC
This is the query:
select top 10 product , gain [gain%]
from
(
SELECT curr.Product, ( (curr.Sales - prev.Sales ) * 100.0)/prev.Sales gain
from
(select weeknumber, product, sales from ProductInfo where weeknumber = 25 ) prev
JOIN
(select weeknumber, product, sales from ProductInfo where weeknumber = 26 ) curr
on prev.product = curr.product
where prev.Sales > 0 and curr.Sales > 0
)A
order by gain desc
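As a rough check of the self-join approach, here is a sketch in Python with the standard-library sqlite3 module. The question's sample table was not preserved, so the rows below are hypothetical values chosen to reproduce the stated result (C 400%, D 12.5%, B 10%), with A sold only in week 25 and E only in week 26. The top_gainers helper is also made up for this example, and SQLite's LIMIT stands in for SQL Server's TOP.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProductInfo (WeekNumber INTEGER, Product TEXT, Sales INTEGER);
    -- Hypothetical data: B, C and D are sold in both weeks.
    INSERT INTO ProductInfo VALUES
        (25, 'A', 50), (25, 'B', 100), (25, 'C', 10), (25, 'D', 80),
        (26, 'B', 110), (26, 'C', 50), (26, 'D', 90), (26, 'E', 30);
""")


def top_gainers(conn, week, limit=10):
    """Return (product, percent gain) for `week` versus the week before."""
    sql = """
        SELECT curr.Product,
               (curr.Sales - prev.Sales) * 100.0 / prev.Sales AS gain
        FROM ProductInfo prev
        JOIN ProductInfo curr
          ON curr.Product = prev.Product
         AND curr.WeekNumber = prev.WeekNumber + 1
        WHERE curr.WeekNumber = ?
          AND prev.Sales > 0 AND curr.Sales > 0
        ORDER BY gain DESC
        LIMIT ?
    """
    return conn.execute(sql, (week, limit)).fetchall()


gainers = top_gainers(conn, 26)
for product, gain in gainers:
    print(product, f"{gain}%")
```

The inner join drops A and E automatically because they have no row in the other week. Sorting ascending instead of descending would give the losers list.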

How to select the last 12 months in sql?

I need to select the last 12 months. As you can see on the picture, May occurs two times.
But I only want it to occur once. And it needs to be the newest one.
Plus, the table should stay in this structure, with the latest month on the bottom.
And this is the query:
SELECT Monat2,
Monat,
CASE WHEN NPLAY_IND = '4P'
THEN 'QuadruplePlay'
WHEN NPLAY_IND = '3P'
THEN 'TriplePlay'
WHEN NPLAY_IND = '2P'
THEN 'DoublePlay'
WHEN NPLAY_IND = '1P'
THEN 'SinglePlay'
END AS Series,
Anzahl as Cnt
FROM T_Play_n
where NPLAY_IND != '0P'
order by Series asc ,Monat
This is the new query
SELECT sub.Monat2,sub.Monat,
CASE WHEN NPLAY_IND = '4P'
THEN 'QuadruplePlay'
WHEN NPLAY_IND = '3P'
THEN 'TriplePlay'
WHEN NPLAY_IND = '2P'
THEN 'DoublePlay'
WHEN NPLAY_IND = '1P'
THEN 'SinglePlay'
END
AS Series, Anzahl as Cnt FROM (SELECT ROW_NUMBER () OVER (PARTITION BY Monat2 ORDER BY Monat DESC)rn,
Monat2,
Monat,
Anzahl,
NPLAY_IND
FROM T_Play_n)sub
where sub.rn = 1
It does only show the months once but it doesn't do that for every Series.
So with every Play it should have 12 months.
In Oracle and SQL-Server you can use ROW_NUMBER.
name = month name and num = month number:
SELECT sub.name, sub.num
FROM (SELECT ROW_NUMBER () OVER (PARTITION BY name ORDER BY num DESC) rn,
name,
num
FROM tab) sub
WHERE sub.rn = 1
ORDER BY num DESC;
WITH R(N) AS
(
SELECT 0
UNION ALL
SELECT N+1
FROM R
WHERE N < 11 -- N runs from 0 to 11, i.e. 12 months
)
SELECT LEFT(DATENAME(MONTH,DATEADD(MONTH,-N,GETDATE())),3) AS [month]
FROM R
WITH R(N) is a common table expression: R is the name of the result set (or table) that you are generating, and N is the number of months to go back from the current date.
In SQL Server you can do it as follows:
SELECT DateMonth, DateWithMonth -- Specify columns to select
FROM Tbl -- Source table
WHERE CAST(CAST(DateWithMonth AS INT) * 100 + 1 AS VARCHAR(20)) >= DATEADD(MONTH, -12,GETDATE()) -- Condition to return data for last 12 months
GROUP BY DateMonth, DateWithMonth -- Uniqueness
ORDER BY DateWithMonth -- Sorting to get latest records on the bottom
So it sounds like you want to select rows that contain the last occurrence of months. Something like this should work:
select * from [table_name]
where id in (select max(id) from [table_name] group by [month_column])
The select in the brackets gets a list of ids for the last occurrence of each month. If the year+month column you have shown is not already in ascending id order, you might want to take the max of that column instead.
You can use something like this (the table dbo.Nums contains int values from 0 to 11):
SELECT DATEADD(MONTH, DATEDIFF(MONTH, '19991201', CURRENT_TIMESTAMP) + n - 12, '19991201'),
DATENAME(MONTH,DateAdd(Month, DATEDIFF(month, '19991201', CURRENT_TIMESTAMP) + n - 12, '19991201'))
FROM dbo.Nums
I suggest using GROUP BY on the month name and MAX on the numeric component. If the component is not numeric, convert it first with to_number().
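The GROUP BY / MAX suggestion can be extended so that every Series keeps its own set of months, which is what the follow-up in the question asks for. The sketch below uses Python's standard-library sqlite3 module; the rows are hypothetical, and it assumes Monat is a sortable yyyymm number and Monat2 the month name, which the question does not state explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T_Play_n (Monat2 TEXT, Monat INTEGER, NPLAY_IND TEXT, Anzahl INTEGER);
    -- Hypothetical rows: May appears twice for each series.
    INSERT INTO T_Play_n VALUES
        ('May', 201405, '1P', 5), ('Jun', 201406, '1P', 6), ('May', 201505, '1P', 7),
        ('May', 201405, '2P', 1), ('Jun', 201406, '2P', 2), ('May', 201505, '2P', 3);
""")

sql = """
    SELECT t.NPLAY_IND, t.Monat2, t.Monat, t.Anzahl
    FROM T_Play_n t
    JOIN (
        -- Newest Monat per month name AND per series.
        SELECT NPLAY_IND, Monat2, MAX(Monat) AS latest
        FROM T_Play_n
        GROUP BY NPLAY_IND, Monat2
    ) m
      ON m.NPLAY_IND = t.NPLAY_IND
     AND m.Monat2 = t.Monat2
     AND m.latest = t.Monat
    WHERE t.NPLAY_IND != '0P'
    ORDER BY t.NPLAY_IND, t.Monat
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

The same effect can be had with ROW_NUMBER() by partitioning on both Monat2 and NPLAY_IND instead of Monat2 alone.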

How use the operator IN with a subquery that returns two columns

Hello masters I need your help.
Having the table:
DataCollection
==================
PK Code
smallint RestaurantCode
smallint Year
tinyint Month
money Amount
money AccumulativeMonthsAmount
...
I need the AccumulativeMonthsAmount for the last month of every restaurant.
First, I get the last month per restaurant for the current year (2012 in this case):
SELECT RestaurantCode, MAX(Month) as Month FROM DataCollection
WHERE (Year >= 2012 AND YEAR <= 2012) GROUP BY RestaurantCode
Now I want to use that as subquery, to get the Last - AccumulativeMonthsAmount :
SELECT AccumulativeMonthsAmount FROM DataCollection
WHERE (RestaurantCode, Month)
IN (SELECT RestaurantCode, MAX(Month) as Month FROM DataCollection
WHERE (Year >= 2012 AND YEAR <= 2012) GROUP BY RestaurantCode)
But the IN operator doesn't work with two columns. How should I do it?
Sample Data sorted by Year and Month:
RestCode Amount Accumulative Year Month
123 343.3 345453.65 2012 12
123 124.7 345329.00 2012 11
...
122 312.2 764545.00 2012 12
122 123.4 764233.00 2012 11
...
999 500.98 2500.98 2012 6
999 100.59 2000.00 2012 5
...
I want to get the Accumulative for the last month of every restaurant:
RestCode Accumulative Month
123 345453.65 12
122 764545.00 12
999 2500.98 6
...
SELECT dc.AccumulativeMonthsAmount
FROM dbo.DataCollection AS dc
INNER JOIN
(
SELECT RestaurantCode, MAX(Month)
FROM dbo.PAL_Entries_Relatives
WHERE [Year] = 2012
GROUP BY RestaurantCode
) AS r(rc, m)
ON dc.RestaurantCode = r.rc
AND dc.[Month] = r.m;
With the changed requirements:
;WITH x AS
(
SELECT RestCode, Accumulative, [Month],
rn = ROW_NUMBER() OVER (PARTITION BY RestCode ORDER BY [Month] DESC)
FROM dbo.DataCollection -- or is it dbo.PAL_Entries_Relatives?
)
SELECT RestCode, Accumulative, [Month]
FROM x
WHERE rn = 1
ORDER BY [Month] DESC, RestCode DESC;
That syntax is not allowed in SQL Server. You can do something similar with EXISTS:
SELECT AccumulativeMonthsAmount
FROM DataCollection dc
WHERE exists (select 1
from PAL_Entries_Relatives er
where (Year >= 2012 AND YEAR <= 2012)
group by RestaurantCode
having er.RestaurantCode = dc.RestaurantCode and
max(er.month) = dc.Month
)
SELECT dc.AccumulativeMonthsAmount
FROM DataCollection dc
INNER JOIN PAL_Entries_Relatives er
ON dc.RestaurantCode = er.RestaurantCode
WHERE (er.[Year] >= 2012 AND er.[Year] <= 2012)
GROUP BY dc.RestaurantCode, dc.[Month], dc.AccumulativeMonthsAmount
HAVING dc.[Month] = MAX(er.[Month])
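To see the derived-table join working on the sample rows from the question, here is a small sketch in Python with the standard-library sqlite3 module. It assumes everything lives in the single DataCollection table described in the question; the alias LastMonth is introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DataCollection (RestaurantCode INTEGER, Year INTEGER, Month INTEGER,
                                 Amount REAL, AccumulativeMonthsAmount REAL);
    -- Sample rows from the question.
    INSERT INTO DataCollection VALUES
        (123, 2012, 12, 343.3,  345453.65), (123, 2012, 11, 124.7,  345329.00),
        (122, 2012, 12, 312.2,  764545.00), (122, 2012, 11, 123.4,  764233.00),
        (999, 2012, 6,  500.98, 2500.98),   (999, 2012, 5,  100.59, 2000.00);
""")

sql = """
    SELECT dc.RestaurantCode, dc.AccumulativeMonthsAmount, dc.Month
    FROM DataCollection dc
    JOIN (
        -- Last month per restaurant: the two-column subquery from the question.
        SELECT RestaurantCode, MAX(Month) AS LastMonth
        FROM DataCollection
        WHERE Year = 2012
        GROUP BY RestaurantCode
    ) r
      ON r.RestaurantCode = dc.RestaurantCode
     AND r.LastMonth = dc.Month
    WHERE dc.Year = 2012
    ORDER BY dc.Month DESC, dc.RestaurantCode DESC
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Joining on both columns does the work that the (RestaurantCode, Month) IN (...) pair was meant to do. The outer Year filter matters as soon as the table holds more than one year.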

MDX Count over time (years - not within a year)

I'd like to be able to rollup the count of commitments to a product over years -
The data for new commitments in each year looks like this:
Year | Count of new commitments | (What I'd like - count of new commitments to date)
1986 4 4
1987 22 26
1988 14 40
1989 1 41
I know that within a year you can do year to date, month to date etc, but I need to do it over multiple years.
the mdx that gives me the first 2 columns is (really simple - but I don't know where to go from here):
select [Measures].[Commitment Count] on 0
, [Date Dim].[CY Hierarchy].[Calendar Year] on 1
from [Cube]
Any help would be great
In MDX, something along these lines:
with member [x] as sum(
[Date Dim].[CY Hierarchy].[Calendar Year].members(0) : [Date Dim].[CY Hierarchy].currentMember,
[Measures].[Commitment Count]
)
select [x] on 0, [Date Dim].[CY Hierarchy].[Calendar Year] on 1 from [Cube]
Use a common table expression:
with yearly (year, commitments)
as (
select year
, sum(commitments) as commitments
from theTable
group by year
),
sums (year,sumThisYear,cumulativeSum)
as (
select year
, sum(commitments) as sumThisYear
, sum(commitments) as cumulativeSum
from theTable
where year = (select min(year) from theTable)
group by year
union all
select child.year
, child.commitments as sumThisYear
, child.commitments + parent.cumulativeSum as cumulativeSum
from sums parent
JOIN yearly child on parent.year = child.year - 1
)
select * from sums
There's a bit of a "trick" in there: the per-year totals are aggregated up front in yearly, because SQL Server does not allow GROUP BY or aggregate functions in the recursive part of a common table expression. The recursive step then only has to add the previous year's cumulativeSum to each year's total. Note that the join assumes consecutive years; a missing year ends the chain.
Warning: 11:15pm where I am, written off the top of my head, may need a tweak or two.
EDIT: forgot the group by in the anchor clause, added that in
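For comparison, the running total can be checked on the question's numbers with a sketch in Python and the standard-library sqlite3 module. SQLite's WITH RECURSIVE is used as a stand-in for the SQL Server CTE; theTable and its contents are hypothetical, with 1987 deliberately split into two rows (20 and 2) so the per-year aggregation has something to do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE theTable (year INTEGER, commitments INTEGER);
    -- Hypothetical detail rows that add up to the question's yearly counts.
    INSERT INTO theTable VALUES (1986, 4), (1987, 20), (1987, 2), (1988, 14), (1989, 1);
""")

sql = """
    WITH RECURSIVE
    yearly(year, sumThisYear) AS (
        -- Aggregate first, so the recursive step needs no GROUP BY.
        SELECT year, SUM(commitments) FROM theTable GROUP BY year
    ),
    sums(year, sumThisYear, cumulativeSum) AS (
        -- Anchor: the earliest year starts the running total.
        SELECT year, sumThisYear, sumThisYear
        FROM yearly
        WHERE year = (SELECT MIN(year) FROM yearly)
        UNION ALL
        -- Recursive step: add the next year's total to the running total.
        SELECT c.year, c.sumThisYear, c.sumThisYear + p.cumulativeSum
        FROM sums AS p
        JOIN yearly AS c ON c.year = p.year + 1
    )
    SELECT year, sumThisYear, cumulativeSum FROM sums ORDER BY year
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

The cumulative column comes out as 4, 26, 40, 41, matching the third column the question asks for.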