Mapping based on condition in TRUE/FALSE big query - google-bigquery

I have this table:
name type order
Hokben 6_image-Resto Siap Santap 3
Hokben Hokben 2
Hokben home_icon-Terdekat 4
Hokben home_icon-Terlaris 2
Jelly Jelly 2
Jelly home_icon-Terlaris 1
Aqua Resto 3
Aqua home_icon-Terdekat 5
I want to know, for each name, whether home_icon-Terdekat has the highest order. The result should look like this:
name type order result
Hokben 6_image-Resto Siap Santap 3 FALSE
Hokben Hokben 2 FALSE
Hokben home_icon-Terdekat 4 TRUE
Hokben home_icon-Terlaris 2 FALSE
Jelly Jelly 2 FALSE
Jelly home_icon-Terdekat 1 FALSE
Aqua Resto 3 FALSE
Aqua home_icon-Terdekat 4 TRUE

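Use the query below. The window is partitioned by name, so each row is compared with the highest order among the rows that share its name, and the type check keeps the flag on the home_icon-Terdekat row only:
select *, type = 'home_icon-Terdekat' and `order` = max(`order`) over(partition by name) as result
from your_table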

Another option is an explicit CASE expression, again with the window partitioned by name:
select t1.name,
t1.type,
t1.`order`,
case when t1.type = 'home_icon-Terdekat' and t1.`order` = (max(t1.`order`) over(partition by t1.name)) then TRUE else FALSE end as result
from YOUR_TABLE t1
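For anyone who wants to try the partitioned comparison without a BigQuery project, here is a minimal sketch that runs the same idea against SQLite through Python's sqlite3 module. The switch to SQLite is my assumption, made only to keep the example self-contained: it needs SQLite 3.25 or later for window functions, quotes the column as "order" instead of using backticks, and returns 1/0 rather than TRUE/FALSE. The rows are the sample table from the question.

```python
import sqlite3

# Sample rows from the question: (name, type, order).
rows = [
    ("Hokben", "6_image-Resto Siap Santap", 3),
    ("Hokben", "Hokben", 2),
    ("Hokben", "home_icon-Terdekat", 4),
    ("Hokben", "home_icon-Terlaris", 2),
    ("Jelly", "Jelly", 2),
    ("Jelly", "home_icon-Terlaris", 1),
    ("Aqua", "Resto", 3),
    ("Aqua", "home_icon-Terdekat", 5),
]

conn = sqlite3.connect(":memory:")
# "order" is a keyword, so it is quoted (SQLite uses double quotes).
conn.execute('CREATE TABLE your_table (name TEXT, type TEXT, "order" INTEGER)')
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", rows)

query = """
SELECT name, type, "order",
       type = 'home_icon-Terdekat'
       AND "order" = MAX("order") OVER (PARTITION BY name) AS result
FROM your_table
"""

# Collect the rows whose flag is true (SQLite returns 1 for true).
flagged = sorted(
    (name, type_) for name, type_, _, result in conn.execute(query) if result
)
print(flagged)
```

Only the home_icon-Terdekat rows for Hokben and Aqua are flagged, because they hold the maximum order within their own name. Dropping the PARTITION BY would compare every row with the global maximum instead.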

How to count if there does not exist TRUE in the same category?

Assume I have two tables:
Cameras:
cameraNum roadNum isWorking
100 1 TRUE
101 1 FALSE
102 1 TRUE
103 3 FALSE
104 3 FALSE
105 7 TRUE
106 7 TRUE
107 7 TRUE
108 9 FALSE
109 9 FALSE
110 9 FALSE
Road:
roadNum length
1 90
3 140
7 110
9 209
I want to select the roads on which no camera is working, like this:
roadNum length
3 140
9 209
I tried this:
SELECT r.roadNum, r.length
FROM Cameras c, Road r
WHERE c.isWorking = FALSE
AND r.roadNum = c.roadNum
But this code returns every road that has at least one camera with isWorking = FALSE:
roadNum length
1 90
3 140
9 209
You want the roads where none of the cameras is working. Here is one way to do it with aggregation and having:
select r.*
from road r
inner join cameras c on c.roadNum = r.roadNum
group by r.roadNum
having not bool_or(c.isWorking)
Demo on DB Fiddle:
roadnum length
3 140
9 209
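A minimal sketch of the aggregate-and-filter idea, run against SQLite through Python's sqlite3 module so it is self-contained. SQLite has no bool_or, so this version stores isWorking as 0/1 and uses MAX(isWorking) = 0 as a stand-in; that substitution and the table definitions are my assumptions, and the rows are the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE road (roadNum INTEGER PRIMARY KEY, length INTEGER);
CREATE TABLE cameras (cameraNum INTEGER PRIMARY KEY, roadNum INTEGER, isWorking INTEGER);
INSERT INTO road VALUES (1, 90), (3, 140), (7, 110), (9, 209);
INSERT INTO cameras VALUES
  (100, 1, 1), (101, 1, 0), (102, 1, 1),
  (103, 3, 0), (104, 3, 0),
  (105, 7, 1), (106, 7, 1), (107, 7, 1),
  (108, 9, 0), (109, 9, 0), (110, 9, 0);
""")

# MAX(isWorking) = 0 holds only when every camera on the road has isWorking = 0,
# which plays the role of NOT bool_or(isWorking) in the PostgreSQL answer.
query = """
SELECT r.roadNum, r.length
FROM road r
JOIN cameras c ON c.roadNum = r.roadNum
GROUP BY r.roadNum, r.length
HAVING MAX(c.isWorking) = 0
ORDER BY r.roadNum
"""
dead_roads = conn.execute(query).fetchall()
print(dead_roads)
```

Roads 3 and 9 come back, matching the result above. Road 1 is excluded because two of its three cameras are working.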
Regarding NOT EXISTS: yes, you can use it. The following uses a CTE to get only the roadnum values that satisfy the camera requirement, then joins that to road:
with no_working_camera (roadnum) as
( select distinct on (c1.roadNum)
c1.roadnum
from cameras c1
where not c1.isworking
and not exists (select null
from cameras c2
where c2.roadNum = c1.roadNum
and c2.isworking
)
order by c1.roadnum
)
select r.*
from no_working_camera nwc
join road r
on nwc.roadnum = r.roadnum;
Don't join, don't aggregate, just use NOT IN or NOT EXISTS in order to find roads that don't have a working camera:
select *
from road
where roadnum not in (select roadnum from cameras where isworking);
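One difference between the approaches is worth seeing on data: a road with no cameras at all is returned by NOT IN / NOT EXISTS but not by the inner-join version. The sketch below uses SQLite through Python's sqlite3 module (my choice, to keep it self-contained) with the sample data from the question plus one hypothetical road, number 11, that has no cameras and is not part of the original data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE road (roadNum INTEGER PRIMARY KEY, length INTEGER);
CREATE TABLE cameras (cameraNum INTEGER PRIMARY KEY, roadNum INTEGER, isWorking INTEGER);
-- Road 11 is hypothetical: it is added here only to show the no-camera case.
INSERT INTO road VALUES (1, 90), (3, 140), (7, 110), (9, 209), (11, 50);
INSERT INTO cameras VALUES
  (100, 1, 1), (101, 1, 0), (102, 1, 1),
  (103, 3, 0), (104, 3, 0),
  (105, 7, 1), (106, 7, 1), (107, 7, 1),
  (108, 9, 0), (109, 9, 0), (110, 9, 0);
""")

# Roads for which no working camera exists (includes roads without cameras).
not_exists_sql = """
SELECT r.roadNum
FROM road r
WHERE NOT EXISTS (
    SELECT 1 FROM cameras c
    WHERE c.roadNum = r.roadNum AND c.isWorking = 1
)
ORDER BY r.roadNum
"""

# Join + aggregate: a road must have at least one camera row to appear.
join_sql = """
SELECT r.roadNum
FROM road r
JOIN cameras c ON c.roadNum = r.roadNum
GROUP BY r.roadNum
HAVING MAX(c.isWorking) = 0
ORDER BY r.roadNum
"""

via_not_exists = [row[0] for row in conn.execute(not_exists_sql)]
via_join = [row[0] for row in conn.execute(join_sql)]
print(via_not_exists, via_join)
```

Which behaviour is right depends on whether "no camera working" should also cover roads that have no cameras; the question does not say.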

Subquery or CTE to identify the mix of area in an extra column

I have the following table, to which I want to add a new column, type, that is either "pure" or "mix" based on two conditions.
id unit area n_unit qty
1245 5485245 A 2 1
1245 2488754 B 2 1
2358 548754 A 3 1
2358 84447 A 3 1
2358 548754 A 3 1
4582 84447 C 2 1
4582 548754 D 2 1
9696 84447 B 2 1
9696 548754 K 2 1
I am looking to have a result as below:
id unit area n_unit qty type
1245 5485245 A 2 1 mix
1245 2488754 B 2 1 mix
2358 548754 A 3 1 pure
2358 84447 A 3 1 pure
2358 548754 A 3 1 pure
4582 84447 C 2 1 pure
4582 548754 D 2 1 pure
9696 84447 B 2 1 mix
9696 548754 K 2 1 mix
My logic is this:
If all the rows with the same Id are either Area A, C or D then all rows with that Id are type "pure".
Otherwise, i.e. if a letter which is not A, C or D exists within the Id, all rows with the same Id are type "mix".
The n_unit column is the total number of units, i.e. the number of rows with the same Id.
Looking forward to your kind help.
This takes one window function and one CASE expression:
SELECT *, MIN(CASE WHEN area IN ('A', 'C', 'D')
THEN 'pure'
ELSE 'mix' END) OVER(PARTITION BY id) AS type
FROM tab
The CASE expression labels each row 'pure' or 'mix'. Because 'mix' sorts before 'pure', MIN returns 'mix' for the whole partition as soon as one row with that id is 'mix'; otherwise every row gets 'pure'.
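A minimal sketch of the MIN-over-CASE trick, run against SQLite (3.25 or later) through Python's sqlite3 module; the engine choice is my assumption, not something stated in the question. To keep the output short it loads only the id, unit and area columns from the sample and adds DISTINCT so there is one label per id, whereas the query above keeps every row.

```python
import sqlite3

# (id, unit, area) taken from the sample table in the question.
rows = [
    (1245, 5485245, "A"), (1245, 2488754, "B"),
    (2358, 548754, "A"), (2358, 84447, "A"), (2358, 548754, "A"),
    (4582, 84447, "C"), (4582, 548754, "D"),
    (9696, 84447, "B"), (9696, 548754, "K"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER, unit INTEGER, area TEXT)")
conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", rows)

# 'mix' < 'pure' in string order, so MIN picks 'mix' whenever any row
# of the partition has an area outside A, C, D.
query = """
SELECT DISTINCT id,
       MIN(CASE WHEN area IN ('A', 'C', 'D') THEN 'pure' ELSE 'mix' END)
           OVER (PARTITION BY id) AS type
FROM tab
ORDER BY id
"""
labels = conn.execute(query).fetchall()
print(labels)
```

The trick depends on the two labels sorting in the right order. With different label names you would have to check which one is alphabetically smaller and switch to MAX if needed.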

Order by descending aggregation within window function in PostgreSQL

I have a dataset that features duplicate values of the primary variable, something like the following:
col1 col2 counts
110 False 1
111 False 2
111 False 1
112 True 3
112 False 2
112 False 1
113 False 1
114 False 1
115 False 2
115 False 1
116 False 1
117 False 1
118 False 4
118 False 3
118 False 2
118 False 1
I have achieved this by using the following code:
SELECT DISTINCT ctm_nbr
,col1
,col2
,RANK () OVER (PARTITION BY col1 ORDER BY col2) AS counts
FROM my_table
GROUP BY 1,2,3
ORDER BY ctm_nbr, row_numb DESC
However, my desired output needs to be ordered such that counts is descending yet col1 remains partitioned, so that I can see, for example, which value from col1 has the highest number of counts. Like this...
col1 col2 counts
118 False 4
118 False 3
118 False 2
118 False 1
112 True 3
112 False 2
112 False 1
115 False 2
115 False 1
111 False 2
111 False 1
110 False 1
113 False 1
114 False 1
116 False 1
117 False 1
I have tried various iterations of the final ORDER BY clause but just can't quite produce the output I need. Guidance appreciated.
You can use window functions in the order by. I think you just want:
ORDER BY COUNT(*) OVER (PARTITION BY ctm_nbr) DESC,
ctm_nbr,
row_numb DESC
This assumes that the count is the maximum value of row_numb (which appears to be the counts column in the sample). So you can also express this as:
ORDER BY MAX(row_numb) OVER (PARTITION BY ctm_nbr) DESC,
ctm_nbr,
row_numb DESC
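A minimal sketch of ordering by a window aggregate, using SQLite (3.25 or later) through Python's sqlite3 module instead of PostgreSQL so that it runs anywhere; that substitution is my assumption. It uses the col1 and counts names from the sample output rather than ctm_nbr and row_numb, and only a subset of the sample rows.

```python
import sqlite3

# (col1, counts) pairs taken from the sample in the question.
rows = [
    (110, 1),
    (111, 2), (111, 1),
    (112, 3), (112, 2), (112, 1),
    (118, 4), (118, 3), (118, 2), (118, 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (col1 INTEGER, counts INTEGER)")
conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)

# Sort whole groups by their largest counts value, then keep each
# group together (col1) and list its rows from high to low.
query = """
SELECT col1, counts
FROM my_table
ORDER BY MAX(counts) OVER (PARTITION BY col1) DESC,
         col1,
         counts DESC
"""
ordered = conn.execute(query).fetchall()
for row in ordered:
    print(row)
```

The second sort key matters: without col1, two groups with the same maximum could interleave their rows.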

SQL or R delete rows and separate by type

I have the following issue in SQL, but I would also be glad to find out how to solve it in R.
I have 3 different tables:
Table 1 - anonymizedData
userID AssigID Score Time on Task
12345 10001 4 60
12346 10001 5 70
12567 10003 9 80
12789 10003 8 67
12903 10004 7 73
Table 2 Anonymized users
userID Teacher
12345 False
12346 False
12567 False
12789 False
12903 True
Table 3 Assignments
AssigID type
10001 1
10001 1
10003 2
10003 2
10004 3
What I am trying to do is:
1. Delete the rows from Table 1 where the user is a teacher: if Teacher in Table 2 is True for a userID, I want to remove that user's rows from Table 1.
2. Build a query that shows the data from Table 1 where the assignment is type 1, so I somehow need to bring in the type value from Table 3.
If someone knows how to do this in R, that would be great too.
Any help would be much appreciated!
Solution for first point:
delete Table1 from Table1
join Table2 on
Table1.userID = Table2.userID and
Table2.Teacher = true
Solution for Point 2
select table1.* from table1
join table3 on
table1.AssigID = table3.AssigID and
table3.type = 1
I am interpreting the question as simply the delete, because that is what is in the title.
You can easily do the delete in SQL. The exact syntax might vary a bit by database, but it looks something like this:
delete from data d
where d.userID in (select u.userId from users u where u.teacher = 'true');
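A minimal sketch of both steps, run against SQLite through Python's sqlite3 module so it is self-contained. The table names data, users and assignments, the TimeOnTask column name, and the 0/1 encoding of Teacher are my assumptions; the rows are the sample data from the question. The type 1 query uses IN instead of a join because Table 3 contains each AssigID twice, and a plain join would duplicate every matching row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (userID INTEGER, AssigID INTEGER, Score INTEGER, TimeOnTask INTEGER);
CREATE TABLE users (userID INTEGER, Teacher INTEGER);
CREATE TABLE assignments (AssigID INTEGER, type INTEGER);
INSERT INTO data VALUES
  (12345, 10001, 4, 60), (12346, 10001, 5, 70), (12567, 10003, 9, 80),
  (12789, 10003, 8, 67), (12903, 10004, 7, 73);
INSERT INTO users VALUES (12345, 0), (12346, 0), (12567, 0), (12789, 0), (12903, 1);
INSERT INTO assignments VALUES (10001, 1), (10001, 1), (10003, 2), (10003, 2), (10004, 3);
""")

# Step 1: delete the rows that belong to teachers. IN copes with any
# number of teachers, where "=" would fail on a multi-row subquery.
with conn:
    conn.execute(
        "DELETE FROM data WHERE userID IN (SELECT userID FROM users WHERE Teacher = 1)"
    )
remaining = [row[0] for row in conn.execute("SELECT userID FROM data ORDER BY userID")]

# Step 2: rows for type 1 assignments, without duplicating them.
type1 = conn.execute("""
    SELECT d.userID, d.AssigID, d.Score
    FROM data d
    WHERE d.AssigID IN (SELECT AssigID FROM assignments WHERE type = 1)
    ORDER BY d.userID
""").fetchall()
print(remaining, type1)
```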
Using dplyr in R, join all the tables together and filter out the teachers:
table1 %>%
inner_join(table2, by = "userID") %>%
inner_join(table3, by = "AssigID") %>%
filter(Teacher == "False")
userID AssigID Score Time_on_Task Teacher type
1 12345 10001 4 60 False 1
2 12345 10001 4 60 False 1
3 12346 10001 5 70 False 1
4 12346 10001 5 70 False 1
5 12567 10003 9 80 False 2
6 12567 10003 9 80 False 2
7 12789 10003 8 67 False 2
8 12789 10003 8 67 False 2

How to Update column Num with an incremental number in Master-Detail with row_number()?

I want to update my column Acc.DocHeader.Num and Acc.DocItem.Num with an incremental number. I have:
UPDATE x
SET x.Num = x.newNum,x.iNum=x.newNum
FROM (
SELECT Num,iNum, ROW_NUMBER() OVER (ORDER BY DocCreateDate ,DailyNum) AS newNum
FROM (SELECT h.Num,h.DocCreateDate,h.DailyNum,i.Num iNum FROM Acc.DocHeader h INNER JOIN Acc.DocItem i ON i.DocHeaderRef = h.Id WHERE h.Year = 1395 AND h.BranchRef = 1) AS header
) x
Why do I get Derived table 'x' is not updatable because the modification affects multiple base tables?
DocHeader Table :
Id Num Year DocCreateDate
-------------------------------------------------------
1 NULL 1396 2016-03-20
2 NULL 1395 2016-04-02
3 NULL 1395 2016-04-05
4 NULL 1395 2016-04-10
DocItem Table:
Id Num DocHeaderRef
----------------------------------------------
1 NULL 1
2 NULL 1
3 NULL 1
4 NULL 4
5 NULL 4
6 NULL 3
7 NULL 3
8 NULL 3
Desired output:
DocHeader Table:
Id Num Year DocCreateDate
-------------------------------------------------------
1 1 1396 2016-03-20
2 1 1395 2016-04-02
3 2 1395 2016-04-05
4 3 1395 2016-04-10
DocItem Table:
Id Num DocHeaderRef
----------------------------------------------
1 1 1
2 1 1
3 1 1
4 3 4
5 3 4
6 4 3
7 4 3
8 4 3
You are attempting to update columns from two different tables in a single update statement:
Num comes from Acc.DocHeader
iNum comes from Acc.DocItem
In SQL Server, you can only update one table at a time in an UPDATE.
You can update multiple tables in a single transaction. You can also use the OUTPUT clause to capture the values from the rows being updated. This answers the question of why you cannot do what you want.
I find your query a bit hard to follow -- and your question doesn't explain what you are trying to do -- so it is hard to suggest alternatives.
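To illustrate the "one table per UPDATE, both inside one transaction" point, here is a minimal sketch under one possible reading of the question: number the 1395 headers by DocCreateDate, then copy each header's number to its items. It runs against SQLite through Python's sqlite3 module rather than SQL Server, leaves out DailyNum and BranchRef because they are not in the sample tables, and drops the Acc schema prefix; all of that is my assumption, so the results need not match the desired output shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DocHeader (Id INTEGER PRIMARY KEY, Num INTEGER, Year INTEGER, DocCreateDate TEXT);
CREATE TABLE DocItem (Id INTEGER PRIMARY KEY, Num INTEGER, DocHeaderRef INTEGER);
INSERT INTO DocHeader VALUES
  (1, NULL, 1396, '2016-03-20'), (2, NULL, 1395, '2016-04-02'),
  (3, NULL, 1395, '2016-04-05'), (4, NULL, 1395, '2016-04-10');
INSERT INTO DocItem VALUES
  (1, NULL, 1), (2, NULL, 1), (3, NULL, 1), (4, NULL, 4),
  (5, NULL, 4), (6, NULL, 3), (7, NULL, 3), (8, NULL, 3);
""")

# Compute the new numbers once: (header Id, running number by date).
numbering = conn.execute("""
    SELECT Id, ROW_NUMBER() OVER (ORDER BY DocCreateDate)
    FROM DocHeader
    WHERE Year = 1395
""").fetchall()
params = [(num, header_id) for header_id, num in numbering]

# Two separate UPDATE statements, one per table, committed together.
# "with conn" commits on success and rolls back if either update fails.
with conn:
    conn.executemany("UPDATE DocHeader SET Num = ? WHERE Id = ?", params)
    conn.executemany("UPDATE DocItem SET Num = ? WHERE DocHeaderRef = ?", params)

headers = conn.execute("SELECT Id, Num FROM DocHeader ORDER BY Id").fetchall()
items = conn.execute("SELECT Id, Num FROM DocItem ORDER BY Id").fetchall()
print(headers)
print(items)
```

Header 1 and its items keep NULL because that header belongs to year 1396 and is filtered out, just as the WHERE clause in the original query would do.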