How to assign a new column value for only one specific row - google-bigquery

I have a somewhat complex query to write in BigQuery.
Please see the example below.
I have:
Team    Team type
Team 1  B1
Team 1  B2
Team 2  B3
Team 2  B4
Team 3  B5
Team 3  B6
I need to have
Team    Teamtype  Team_model
Team 1  B1        Team x
Team 1  B2        Team 1
Team 2  B3        Team 2
Team 2  B4        Team 2
Team 3  B5        Team 3
Team 3  B6        Team 3
The logic would be something like this:
CASE
WHEN teamtype = A1 THEN "Team x"
ELSE team_model = %team%
END
AS team_model
I know that in BigQuery you can't do it this way. Does anyone know how to perform this type of operation?
Thank you!
Regards,
Martin

You were on the right track. You may use this version:
SELECT Team, Teamtype, CASE Teamtype WHEN 'B1' THEN 'Team x' ELSE Team END AS Team_model
FROM yourTable
ORDER BY Team, Teamtype;
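To check the CASE expression against the sample data, here is a minimal Python sketch. It assumes SQLite (through the standard sqlite3 module) as a stand-in for BigQuery, so it only demonstrates the logic of the query, not BigQuery itself. The table name yourTable is the placeholder used in the answer.

```python
import sqlite3

# SQLite stands in for BigQuery here (an assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (Team TEXT, Teamtype TEXT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [("Team 1", "B1"), ("Team 1", "B2"), ("Team 2", "B3"),
     ("Team 2", "B4"), ("Team 3", "B5"), ("Team 3", "B6")],
)

# Only the B1 row gets the fixed label; every other row falls back to Team.
rows = conn.execute(
    """
    SELECT Team, Teamtype,
           CASE Teamtype WHEN 'B1' THEN 'Team x' ELSE Team END AS Team_model
    FROM yourTable
    ORDER BY Team, Teamtype
    """
).fetchall()

for row in rows:
    print(row)
```

The ELSE branch simply returns the Team column, which is what the pseudocode "ELSE team_model = %team%" was trying to express.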

Related

How do I keep a row's data together (moving together) when a row is shifted or added above it?

I have an Excel file containing two sheets:
Teams (which contains details of teams and time schedule of their matches)
Results (contains calculation and number of matches and results, etc).
What I did is that each game with Launched status on the Teams sheet is pulled into the Results sheet, and then I enter the numbers and calculations on that sheet.
My problem is that when a game that sits between two already-started games starts, its name is inserted and pushes the name in the row below down, but the new name takes over the details that belonged to the row below, when those details should have moved down with their own team. Here is my example:
Sheet1 Teams
   A       B
1  Names   Status
2  TEAM A  Launched
3  TEAM B  Pending
4  TEAM C  Pending
5  TEAM D  Launched
Sheet2 Results
   A       B               C         D      E
1  Names   1st Half goals  2nd half  total  points
2  TEAM A  1               2         3      13
3  TEAM D  3               1         4      10
So if I change the status of TEAM B to Launched on Sheet1 (Teams), it appears on Sheet2 (Results) and takes over the row input of TEAM D.
It will be like this:
Sheet2 Results
   A       B               C         D      E
1  Names   1st Half goals  2nd half  total  points
2  TEAM A  1               2         3      13
3  TEAM B  3               1         4      10
4  TEAM D
Is there any solution? Please let me know.
This is the formula in cell A2 of Sheet2 (Results):
=IFERROR(INDEX(TEAMS!A$2:A$550,SMALL(IF(TEAMS!B$2:B$550="Launched",ROW(TEAMS!B$2:B$550)-ROW(TEAMS!B$1)),ROW(TEAMS!B2))),"")
Columns B, C, and E contain no formulas, only manual input, and column D contains =SUM(C2,B2).
I found an answer to my question, thanks to the users GraH - Guido and Vletm on the Chandoo Forum, who helped me find it.
I found 2 answers, both helpful and working for my case; you can find them here and here.
Thanks.

Python pandas: delete the groups in a DataFrame whose size is below a value

I have a DataFrame called df (this is just an example; the real data is large, so please consider computing speed) as follows:
name id text
tom 1 a1
lucy 2 b1
john 3 c1
tick 4 d1
tom 1 a2
lucy 2 b2
john 3 c2
tick 4 d2
tom 1 a3
lucy 2 b3
john 3 c3
tick 4 d3
tom 1 a4
tick 4 d4
tom 1 a5
lucy 2 b5
tick 4 d5
The DataFrame can be grouped by name (tom, john, lucy, tick). I want to delete the rows of every group (by name) whose size is less than 5. Since the lucy and john groups each have fewer than 5 rows, I want to delete their data and get a new df containing only the tick and tom data.
Could you tell me how to do it, please? Thanks!
I think you can use a filter for this. It would only be one line:
import pandas as pd
df = pd.DataFrame({'name': ['tom','lucy','john','tick','tom','lucy','john','tick', 'tom', 'lucy','john','tick','tom','tick','tom', 'lucy','tick'], 'id':[1,2,3,4,1,2,3,4,1,2,3,4,1,4,1,2,4],'text':['a1','b1','c1','d1','a2','b2','c2','d2','a3','b3','c3','d3','a4','d4','a5','b5','d5']})
df.groupby('name').filter(lambda x: len(x) >= 5)
and the output contains only tick and tom:
id name text
0 1 tom a1
3 4 tick d1
4 1 tom a2
7 4 tick d2
8 1 tom a3
11 4 tick d3
12 1 tom a4
13 4 tick d4
14 1 tom a5
16 4 tick d5
You can use value_counts() and then, if you want, reset the index with reset_index():
s = df.name.value_counts()
print(df[df.name.isin(s[s > 4].index)].reset_index(drop=True))
name id text
0 tom 1 a1
1 tick 4 d1
2 tom 1 a2
3 tick 4 d2
4 tom 1 a3
5 tick 4 d3
6 tom 1 a4
7 tick 4 d4
8 tom 1 a5
9 tick 4 d5
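Since the question mentions computing speed, a third variant may be worth trying: broadcasting each group's size back to its rows with transform("size") and using it as a boolean mask. This is a sketch of an alternative, not something proposed in the answers above, and whether it is actually faster than filter() with a Python lambda on your data is an assumption you would need to measure.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["tom", "lucy", "john", "tick", "tom", "lucy", "john", "tick",
             "tom", "lucy", "john", "tick", "tom", "tick", "tom", "lucy", "tick"],
    "id": [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 4, 1, 2, 4],
    "text": ["a1", "b1", "c1", "d1", "a2", "b2", "c2", "d2",
             "a3", "b3", "c3", "d3", "a4", "d4", "a5", "b5", "d5"],
})

# For every row, the number of rows that share its name.
group_sizes = df.groupby("name")["name"].transform("size")

# Keep only rows whose group has at least 5 members.
result = df[group_sizes >= 5].reset_index(drop=True)
print(result)
```

The mask has the same index as df, so no Python-level function is called per group.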

How to pivot output in SQL Server 2012

Input:
Table1->Student details
sno sname scourse sjoindate
1 Ram bsc 11/26/2011
2 Sham bcom 10/06/2010
3 Krishna ba 06/16/2012
Table2->Student marks
sno id marks
1 1 67
1 2 77
1 3 80
1 4 60
1 5 90
Table3->Subjectnames
id subjectname
1 Computerscience
2 Maths
3 Statistics
4 English
5 Hindi
Table4->Student_feedback_QuestionandAnswer
sno Questions Answers
1 Q1 A1
1 Q2 A2
1 Q3 A3
1 Q4 A4
Expected Output:
sno  sname  scourse  sjoindate   Computerscience  Maths  Statistics  English  Hindi  Questions  Answers
1    Ram    bsc      11/26/2011  67               77     80          60       90     Q1         A1
                                                                                     Q2         A2
                                                                                     Q3         A3
                                                                                     Q4         A4
In the above tables, the number of rows in Table2 and Table3 can increase or decrease depending on the student.
I need to combine the data from all four tables into one result set, but I don't know how to use PIVOT with dynamic headers and dynamic column values:
Data from the Table3.subjectname column should become the header column names, e.g. Computerscience, Hindi.
Data from Table2.marks should be the value under the matching header column, e.g. the marks for id=1 under Computerscience.
The Questions and Answers columns from Table4 should appear in the result set depending on the value of Table1.sno.
A hint on how to build a pivot with dynamic headers and dynamic column values, even using just two tables, would be appreciated.
Please let me know if you need more clarification.
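As a hint for the two-table part (marks and subject names), here is a minimal Python sketch of a dynamic pivot. It assumes SQLite (via the standard sqlite3 module) as a stand-in for SQL Server 2012 and uses conditional aggregation with a column list generated from the subjects table, rather than the PIVOT operator itself. The table names student, marks, and subjects are hypothetical stand-ins for Table1, Table2, and Table3, and the Questions/Answers columns from Table4 are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (sno INTEGER, sname TEXT, scourse TEXT, sjoindate TEXT);
CREATE TABLE marks (sno INTEGER, id INTEGER, marks INTEGER);
CREATE TABLE subjects (id INTEGER, subjectname TEXT);
INSERT INTO student VALUES (1, 'Ram', 'bsc', '11/26/2011'),
                           (2, 'Sham', 'bcom', '10/06/2010'),
                           (3, 'Krishna', 'ba', '06/16/2012');
INSERT INTO marks VALUES (1, 1, 67), (1, 2, 77), (1, 3, 80), (1, 4, 60), (1, 5, 90);
INSERT INTO subjects VALUES (1, 'Computerscience'), (2, 'Maths'), (3, 'Statistics'),
                            (4, 'English'), (5, 'Hindi');
""")

# Read the subject list at run time, so new subjects become new columns.
subjects = conn.execute(
    "SELECT id, subjectname FROM subjects ORDER BY id"
).fetchall()

# One aggregated CASE expression per subject; identifiers are double-quoted.
pivot_cols = ", ".join(
    'MAX(CASE WHEN m.id = {} THEN m.marks END) AS "{}"'.format(
        int(sid), name.replace('"', '""'))
    for sid, name in subjects
)

sql = f"""
SELECT s.sno, s.sname, s.scourse, s.sjoindate, {pivot_cols}
FROM student s JOIN marks m ON m.sno = s.sno
GROUP BY s.sno, s.sname, s.scourse, s.sjoindate
ORDER BY s.sno
"""
cur = conn.execute(sql)
headers = [d[0] for d in cur.description]
rows = cur.fetchall()
print(headers)
print(rows)
```

In SQL Server the same idea is usually expressed by building the column list into a string and executing the generated statement as dynamic SQL; the sketch above only illustrates the "generate the columns from the data" step.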

Pig: order by with rank, and join the rank to the data

I have the following data with the schema (t0:chararray, t1:int)
a0 1
a1 7
b2 9
a2 4
b0 6
And I want to order it by t1 and then add a rank column:
a0 1 1
a2 4 2
b0 6 3
a1 7 4
b2 9 5
Is there a convenient way to do this in Pig without writing a UDF?
There is the RANK operation in Pig. This should be sufficient:
X = rank A by t1 ASC;
Please see the Pig docs for more details.
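To make the expected result concrete, here is a small plain-Python sketch (not Pig) of the "order by t1, then attach a rank" operation on the sample data. It assumes all t1 values are distinct, as in the example, and does not try to reproduce how Pig's RANK treats ties or where it places the rank column.

```python
# Sample rows with the schema (t0: str, t1: int) from the question.
rows = [("a0", 1), ("a1", 7), ("b2", 9), ("a2", 4), ("b0", 6)]

# Sort ascending by t1, then number the rows starting from 1.
ordered = sorted(rows, key=lambda r: r[1])
ranked = [(t0, t1, rank) for rank, (t0, t1) in enumerate(ordered, start=1)]

for row in ranked:
    print(row)
```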

How to get the last non empty value of a hierarchy?

I've got a hierarchy with the appropriate value linked to each level, let's say:
A 100
  A1 NULL
  A2 NULL
B
  B1 NULL
  B2 1000
    B21 500
    B22 500
  B3 NULL
This hierarchy is materialized in my database as a parent-child hierarchy
Hierarchy Table
------------------------
Id Code Parent_Id
1 A NULL
2 A1 1
3 A2 1
4 B NULL
5 B1 4
6 B2 4
7 B21 6
8 B22 6
9 B3 4
And here is my fact table :
Fact Table
------------------------
Hierarchy_Id Value
1 100
6 1000
7 500
8 500
My question is: do you have any idea how to get only the last non-empty value of my hierarchy?
I know that there is an MDX function that could do this job, but I'd like to do it another way.
To be clear, the desired output would be:
Fact Table
------------------------
Hierarchy_Id Value
1 100
7 500
8 500
(If necessary, the work of flattening the hierarchy is already done.)
Thank you in advance!
If the codes for your hierarchy are correct, then you can use the information in the codes to determine the depth of the hierarchy. I think you want to filter out any "code" where there is a longer code that starts with it.
In that case:
select f.*
from fact f join
     hierarchy h
     on f.Hierarchy_Id = h.Id
where not exists (select 1
                  from fact f2 join
                       hierarchy h2
                       on f2.Hierarchy_Id = h2.Id
                  where h2.code like concat(h.code, '%') and
                        h2.code <> h.code
                 )
Here I've used the function concat() to create the pattern. In some databases, you might use + or || instead.
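The query can be tried on the sample data with the following minimal Python sketch. It assumes SQLite (via the standard sqlite3 module) as the database, so the pattern is built with || instead of concat(), and it uses the column names from the tables in the question (Id, Code, Hierarchy_Id, Value). The Parent_Id column is omitted because the query does not need it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hierarchy (Id INTEGER, Code TEXT);
CREATE TABLE fact (Hierarchy_Id INTEGER, Value INTEGER);
INSERT INTO hierarchy VALUES (1, 'A'), (2, 'A1'), (3, 'A2'), (4, 'B'), (5, 'B1'),
                             (6, 'B2'), (7, 'B21'), (8, 'B22'), (9, 'B3');
INSERT INTO fact VALUES (1, 100), (6, 1000), (7, 500), (8, 500);
""")

# Keep a fact only if no other fact belongs to a longer code that starts
# with the same code, i.e. no descendant carries a value of its own.
rows = conn.execute("""
SELECT f.Hierarchy_Id, f.Value
FROM fact f
JOIN hierarchy h ON f.Hierarchy_Id = h.Id
WHERE NOT EXISTS (
    SELECT 1
    FROM fact f2
    JOIN hierarchy h2 ON f2.Hierarchy_Id = h2.Id
    WHERE h2.Code LIKE h.Code || '%'
      AND h2.Code <> h.Code
)
ORDER BY f.Hierarchy_Id
""").fetchall()

for row in rows:
    print(row)
```

Note that this relies on the codes encoding the hierarchy by prefix; if sibling codes can share a prefix (for example A1 and A10), the prefix test would treat one as a descendant of the other.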