Add a Rank column to a pandas DataFrame based on a column condition - pandas

What I have is below.
DOG       Date        Steps
Tiger     2021-11-01  164
Oakley    2021-11-01  76
Piper     2021-11-01  65
Millie    2021-11-01  188
Oscar     2021-11-02  152
Foster    2021-11-02  191
Zeus      2021-11-02  101
Benji     2021-11-02  94
Lucy      2021-11-02  186
Rufus     2021-11-02  65
Hank      2021-11-03  98
Olive     2021-11-03  122
Ellie     2021-11-03  153
Thor      2021-11-03  152
Nala      2021-11-03  181
Mia       2021-11-03  48
Bella     2021-11-03  23
Izzy      2021-11-03  135
Pepper    2021-11-03  22
Diesel    2021-11-04  111
Dixie     2021-11-04  34
Emma      2021-11-04  56
Abbie     2021-11-04  32
Guinness  2021-11-04  166
Kobe      2021-11-04  71
What I want is below: a Rank column that ranks the rows by the value of the 'Steps' column within each Date (highest Steps gets rank 1).
DOG       Date        Steps  Rank
Tiger     2021-11-01  164    2
Oakley    2021-11-01  76     3
Piper     2021-11-01  65     4
Millie    2021-11-01  188    1
Oscar     2021-11-02  152    3
Foster    2021-11-02  191    1
Zeus      2021-11-02  101    4
Benji     2021-11-02  94     5
Lucy      2021-11-02  186    2
Rufus     2021-11-02  65     6
Hank      2021-11-03  98     6
Olive     2021-11-03  122    5
Ellie     2021-11-03  153    2
Thor      2021-11-03  152    3
Nala      2021-11-03  181    1
Mia       2021-11-03  48     7
Bella     2021-11-03  23     8
Izzy      2021-11-03  135    4
Pepper    2021-11-03  22     9
Diesel    2021-11-04  111    2
Dixie     2021-11-04  34     5
Emma      2021-11-04  56     4
Abbie     2021-11-04  32     6
Guinness  2021-11-04  166    1
Kobe      2021-11-04  71     3
I tried the code below, but it failed.
df['Rank'] = df.groupby('Date')['Steps'].rank(ascending=False)

Your solution works for me.
You may need method='dense' and a cast to integers, because rank() defaults to method='average' and returns floats:
df['Rank'] = df.groupby('Date')['Steps'].rank(ascending=False, method='dense').astype(int)
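
To see the difference this makes, here is a minimal sketch that rebuilds the first two dates from the question and applies the line above. The `ties` series is made-up data, added only to show how tied values are ranked by the default method and by method='dense'.

```python
import pandas as pd

# First two dates from the question's table.
df = pd.DataFrame({
    "DOG": ["Tiger", "Oakley", "Piper", "Millie",
            "Oscar", "Foster", "Zeus", "Benji", "Lucy", "Rufus"],
    "Date": ["2021-11-01"] * 4 + ["2021-11-02"] * 6,
    "Steps": [164, 76, 65, 188, 152, 191, 101, 94, 186, 65],
})

# Rank within each Date, highest Steps first, as whole numbers.
df["Rank"] = (
    df.groupby("Date")["Steps"]
      .rank(ascending=False, method="dense")
      .astype(int)
)
print(df)

# Made-up values with a tie, to compare the two ranking methods.
ties = pd.Series([10, 10, 5])
average_ranks = ties.rank(ascending=False).tolist()                # [1.5, 1.5, 3.0]
dense_ranks = ties.rank(ascending=False, method="dense").tolist()  # [1.0, 1.0, 2.0]
```

With method='dense' tied rows share a rank and the next rank follows without a gap, which is why the cast to int is safe there.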

Related

Find Maximum Value in Column Pandas

I have a data frame of machine vibration data like this:
   datetime                 tagid  value  quality
0  2021-03-01 13:43:41.440  B42    345    192
1  2021-03-01 13:43:41.440  B43    958    192
2  2021-03-01 13:43:41.440  B44    993    192
3  2021-03-01 13:43:41.440  B45    1224   192
4  2021-03-01 13:43:43.527  B188   6665   192
5  2021-03-01 13:43:43.527  B189   7162   192
6  2021-03-01 13:43:43.527  B190   7193   192
7  2021-03-01 13:43:43.747  C29    2975   192
8  2021-03-01 13:43:43.747  C30    4445   192
9  2021-03-01 13:43:43.747  C31    4015   192
I want to convert this to the hourly maximum value for each tag ID.
Sample Output
datetime          tagid  value  quality
01-03-2021 13:00  C91    3982   192
01-03-2021 14:00  C91    3972   192
01-03-2021 13:00  C92    9000   192
01-03-2021 14:00  C92    9972   192
01-03-2021 13:00  B42    396    192
01-03-2021 14:00  B42    370    192
01-03-2021 15:00  B42    370    192
I tried using Grouper, but couldn't get the output I wanted.

Use Grouper with the max aggregate:
df = df.groupby([pd.Grouper(freq='H', key='datetime'), 'tagid']).max().reset_index()
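
A small runnable sketch of the same idea follows. The readings are made up for illustration (the rows in the question all fall within a single hour), and `pd.offsets.Hour()` is used in place of the 'H' alias only because the alias spelling has changed between pandas versions; the grouping itself is the same. Grouper needs the datetime column to have a datetime dtype, so it is parsed with `pd.to_datetime` first.

```python
import pandas as pd

# Made-up readings for two tags across two hours (not the question's rows).
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2021-03-01 13:05:00", "2021-03-01 13:40:00", "2021-03-01 14:10:00",
        "2021-03-01 13:15:00", "2021-03-01 14:20:00", "2021-03-01 14:55:00",
    ]),
    "tagid": ["B42", "B42", "B42", "C91", "C91", "C91"],
    "value": [345, 396, 370, 3982, 3900, 3972],
    "quality": [192] * 6,
})

# Bucket rows by hour and tag, then take the column-wise maximum per bucket.
hourly = (
    df.groupby([pd.Grouper(key="datetime", freq=pd.offsets.Hour()), "tagid"])
      .max()
      .reset_index()
)
print(hourly)
```

Note that max() is taken per column, so value and quality in one output row may come from different input rows. If the whole row holding the maximum value is needed, selecting rows with idxmax on value is one alternative.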

Oracle SQL Flight Database - Find flight from A to B with stopover?

I have a flight Database with the following table:
FID from fto dep arrive days flightno
---- ----- --- ---- ------ ---- --------
167 MPB KYM 1020 1040 0 EA5203
168 MPB KYM 1510 1530 0 EA5205
169 MPB KYM 1705 1725 0 EA5207
221 NEB KYM 850 1025 0 EA782
222 NEB KYM 1355 1530 0 EA784
223 NEB KYM 1810 1945 0 EA786
557 BAH NEB 1305 1500 0 EA686
558 BAH ELM 605 715 0 EA162
559 BAH ELM 1005 1115 0 EA340
560 BAH ELM 1230 1340 0 EA872
561 BAH ELM 1325 1435 0 EA442
562 BAH ELM 1400 1510 0 EA872
563 BAH ELM 1455 1605 0 EA978
564 BAH ELM 1640 1750 0 EA640
565 BAH ELM 2025 2135 0 EA940
566 BAH YDS 645 845 0 EA992
567 BAH YDS 945 1130 0 EA974
1163 PPP KYM 1040 1110 0 EA3201
1164 PPP KYM 1450 1520 0 EA3207
1190 OKR KYM 825 920 0 EA3200
1191 OKR KYM 1010 1100 0 EA3204
1192 OKR KYM 1500 1605 0 EA3214
1517 SVT KYM 810 920 0 EA3201
1518 SVT KYM 940 1050 0 EA3201
1519 SVT KYM 1215 1310 0 EA3211
1520 SVT KYM 1510 1605 0 EA3211
How do I query it to show indirect flights from BAH to KYM?
I've tried a number of ways to no avail. Any help is greatly appreciated.
A hierarchical query lists every path that starts at BAH and has more than one leg:
select CONNECT_BY_ROOT ffrom || SYS_CONNECT_BY_PATH(fto, '/') path_, level, ffrom, fto
from FLIGHTS_TABLE
where level > 1
start with ffrom = 'BAH'
connect by prior fto = ffrom
order siblings by ffrom;
It returns:
PATH_       | LEVEL | FFROM | FTO
BAH/NEB/KYM | 2     | NEB   | KYM
BAH/NEB/KYM | 2     | NEB   | KYM
BAH/NEB/KYM | 2     | NEB   | KYM
It returns three rows because there are three flights from 'NEB' to 'KYM'. I don't know which of these flights you want to treat as the indirect flight.
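
One way to narrow those three rows down is to require that the second leg departs after the first leg arrives. The sketch below is not Oracle syntax: it runs a plain self-join in SQLite through Python's sqlite3 module, on a subset of the question's rows. The table name `flights`, the helper `one_stop_connections`, and the same-day timing rule are assumptions made for illustration; the column name `ffrom` follows the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flights (fid INTEGER, ffrom TEXT, fto TEXT, "
    "dep INTEGER, arrive INTEGER, flightno TEXT)"
)
# Subset of the question's rows (the days column is 0 everywhere, so it is omitted).
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?, ?, ?, ?)",
    [
        (167, "MPB", "KYM", 1020, 1040, "EA5203"),
        (221, "NEB", "KYM", 850, 1025, "EA782"),
        (222, "NEB", "KYM", 1355, 1530, "EA784"),
        (223, "NEB", "KYM", 1810, 1945, "EA786"),
        (557, "BAH", "NEB", 1305, 1500, "EA686"),
        (558, "BAH", "ELM", 605, 715, "EA162"),
        (566, "BAH", "YDS", 645, 845, "EA992"),
    ],
)


def one_stop_connections(conn, origin, dest, check_times=True):
    """Return (first flight, stopover, second flight) for one-stop routes."""
    sql = (
        "SELECT a.flightno, a.fto, b.flightno "
        "FROM flights AS a "
        "JOIN flights AS b ON b.ffrom = a.fto "
        "WHERE a.ffrom = ? AND b.fto = ?"
    )
    if check_times:
        # Assumption: a valid connection leaves after the first leg has landed.
        sql += " AND b.dep > a.arrive"
    sql += " ORDER BY a.dep, b.dep"
    return conn.execute(sql, (origin, dest)).fetchall()


all_pairs = one_stop_connections(conn, "BAH", "KYM", check_times=False)
feasible = one_stop_connections(conn, "BAH", "KYM")
print(all_pairs)
print(feasible)
```

Without the timing rule the join gives the same three BAH/NEB/KYM combinations as the hierarchical query; with it, only the NEB departure at 1810 is left, since the BAH flight lands at 1500.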

Find frequency of data in SQL Server 2014 by date and time

Here is my question.
I have a table FacebookInfo with a column name.
Another table, FacebookPost, has a column created_time and a foreign key facebookinfoid mapped to the FacebookInfo id column.
So basically FacebookInfo holds a record for each Facebook page, and FacebookPost holds the posts of those pages.
I want to find out how frequently the pages post on Facebook: the average number of posts per day, the difference in hours between posts, the average time of the first post of a day, and the average time of the last post of a day.
Thanks for your help.
Here is some sample data:
FacebookInfo
id name
3 Qatar Airways
4 KLM Royal Dutch Airlines
5 LATAM Airlines
6 Southwest Airlines
FacebookPost
id facebookinfoid created_time
777 3 2016-12-06 12:54:31.000
778 3 2016-12-05 09:54:09.000
779 3 2016-12-02 12:40:46.000
780 3 2016-12-01 13:00:00.000
781 3 2016-11-30 11:29:53.000
782 3 2016-11-30 09:00:00.000
783 3 2016-11-29 10:09:45.000
784 3 2016-11-28 14:00:00.000
785 3 2016-11-27 11:21:11.000
786 3 2016-11-26 12:00:01.000
787 3 2016-11-25 11:58:55.000
788 3 2016-11-24 10:28:19.000
789 3 2016-11-23 16:20:29.000
790 3 2016-11-23 11:19:42.000
791 3 2016-11-21 12:03:07.000
792 3 2016-11-18 13:36:41.000
793 3 2016-11-17 11:08:41.000
794 3 2016-11-16 12:01:00.000
795 3 2016-11-15 13:39:06.000
796 3 2016-11-11 15:11:56.000
1454 4 2016-12-06 15:00:22.000
1455 4 2016-12-05 14:59:04.000
1456 4 2016-12-05 09:00:07.000
1457 4 2016-12-04 15:00:07.000
1458 4 2016-12-03 10:00:08.000
1459 4 2016-12-02 15:00:15.000
1460 4 2016-12-01 14:00:00.000
1461 4 2016-11-30 13:30:24.000
1462 4 2016-11-29 15:00:07.000
1463 4 2016-11-28 15:00:19.000
1464 4 2016-11-28 09:00:09.000
1465 4 2016-11-26 10:00:06.000
1466 4 2016-11-24 15:00:04.000
1467 4 2016-11-23 09:00:09.000
1468 4 2016-11-22 15:01:04.000
1469 4 2016-11-21 15:00:07.000
1470 4 2016-11-21 05:00:10.000
1471 4 2016-11-19 10:00:07.000
1472 4 2016-11-18 09:00:10.000
1473 4 2016-11-17 15:00:01.000
2454 5 2016-12-05 16:00:01.000
2455 5 2016-12-02 16:02:37.000
2456 5 2016-12-01 16:00:09.000
2457 5 2016-11-30 16:00:48.000
2458 5 2016-11-29 16:01:34.000
2459 5 2016-11-28 16:00:00.000
2460 5 2016-11-25 16:00:01.000
2461 5 2016-11-23 16:00:00.000
2462 5 2016-11-22 16:00:00.000
2463 5 2016-11-21 16:00:00.000
2464 5 2016-11-19 16:00:03.000
2465 5 2016-11-18 16:00:00.000
2466 5 2016-11-17 16:00:01.000
2467 5 2016-11-16 16:00:03.000
2468 5 2016-11-15 16:00:01.000
2469 5 2016-11-12 16:00:00.000
2470 5 2016-11-11 16:00:00.000
2471 5 2016-11-10 16:00:01.000
2472 5 2016-11-09 16:00:00.000
2473 5 2016-11-08 16:00:02.000
3059 6 2016-12-06 15:14:30.000
3060 6 2016-12-04 21:38:33.000
3061 6 2016-12-03 22:27:40.000
3062 6 2016-12-02 21:29:42.000
3063 6 2016-12-01 23:00:04.000
3064 6 2016-11-30 22:00:02.000
3065 6 2016-11-30 20:28:17.000
3066 6 2016-11-29 17:57:02.000
3067 6 2016-11-28 20:49:59.000
3068 6 2016-11-26 17:10:55.000
3069 6 2016-11-26 12:50:45.000
3070 6 2016-11-25 21:16:31.000
3071 6 2016-11-25 01:27:09.000
3072 6 2016-11-24 15:50:16.000
3073 6 2016-11-23 22:00:01.000
3074 6 2016-11-23 15:10:32.000
3075 6 2016-11-22 21:42:42.000
3076 6 2016-11-22 16:29:28.000
3077 6 2016-11-22 03:03:21.000
3078 6 2016-11-22 01:45:41.000
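
The metrics described above can be pinned down with a short sketch. It is written in plain Python rather than T-SQL, uses four of the page 3 rows from the sample data, and makes two assumptions that the question leaves open: "per day" counts only days with at least one post, and the "difference in hours" is the mean gap between consecutive posts. The function name `posting_stats` is made up for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Four FacebookPost rows for page 3: (facebookinfoid, created_time).
posts = [
    (3, "2016-11-30 11:29:53.000"),
    (3, "2016-11-30 09:00:00.000"),
    (3, "2016-11-29 10:09:45.000"),
    (3, "2016-11-28 14:00:00.000"),
]


def seconds_since_midnight(ts):
    return ts.hour * 3600 + ts.minute * 60 + ts.second


def posting_stats(rows):
    """Per-page posting frequency figures, under the assumptions stated above."""
    by_page = defaultdict(list)
    for page_id, created in rows:
        by_page[page_id].append(datetime.strptime(created, "%Y-%m-%d %H:%M:%S.%f"))

    stats = {}
    for page_id, times in by_page.items():
        times.sort()
        by_day = defaultdict(list)
        for ts in times:
            by_day[ts.date()].append(ts)
        # Hours between each post and the next one.
        gaps = [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]
        stats[page_id] = {
            "posts_per_active_day": len(times) / len(by_day),
            "avg_gap_hours": mean(gaps) if gaps else None,
            "avg_first_post_sec": mean(
                [seconds_since_midnight(min(day)) for day in by_day.values()]
            ),
            "avg_last_post_sec": mean(
                [seconds_since_midnight(max(day)) for day in by_day.values()]
            ),
        }
    return stats


stats = posting_stats(posts)
print(stats[3])
```

The two averages of first and last post times are returned as seconds since midnight; 39795 seconds, for example, is 11:03:15. The same grouping by page and by calendar day carries over to a SQL query, but the exact date functions depend on the database.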

SQL Server select from 3 tables

I have three tables in my database Books, Borrowers and Movement:
Books
BookID Title Author Category Published
----------- ------------------------------ ------------------------- --------------- ----------
101 Ulysses James Joyce Fiction 1922-06-16
102 Huckleberry Finn Mark Twain Fiction 1884-03-24
103 The Great Gatsby F. Scott Fitzgerald Fiction 1925-06-17
104 1984 George Orwell Fiction 1949-04-19
105 War and Peace Leo Tolstoy Fiction 1869-08-01
106 Gullivers Travels Jonathan Swift Fiction 1726-07-01
107 Moby Dick Herman Melville Fiction 1851-08-01
108 Pride and Prejudice Jane Austen Fiction 1813-08-13
110 The Second World War Winston Churchill NonFiction 1953-09-01
111 Relativity Albert Einstein NonFiction 1917-01-09
112 The Right Stuff Tom Wolfe NonFiction 1979-09-07
121 Hitchhikers Guide to Galaxy Douglas Adams Humour 1975-10-27
122 Dad Is Fat Jim Gaffigan Humour 2013-03-01
131 Kick-Ass 2 Mark Millar Comic 2012-03-03
133 Beautiful Creatures: The Manga Kami Garcia Comic 2014-07-01
Borrowers
BorrowerID Name Birthday
----------- ------------------------- ----------
2 Bugs Bunny 1938-09-08
3 Homer Simpson 1992-09-09
5 Mickey Mouse 1928-02-08
7 Fred Flintstone 1960-06-09
11 Charlie Brown 1965-06-05
13 Popeye 1933-03-03
17 Donald Duck 1937-07-27
19 Mr. Magoo 1949-09-14
23 George Jetson 1948-04-08
29 SpongeBob SquarePants 1984-08-04
31 Stewie Griffin 1971-11-17
Movement
MoveID BookID BorrowerID DateOut DateIn ReturnCondition
----------- ----------- ----------- ---------- ---------- ---------------
1 131 31 2012-06-01 2013-05-24 good
2 101 23 2012-02-10 2012-03-24 good
3 102 29 2012-02-01 2012-04-01 good
4 105 7 2012-03-23 2012-05-11 good
5 103 7 2012-03-22 2012-04-22 good
6 108 7 2012-01-23 2012-02-12 good
7 112 19 2012-01-12 2012-02-10 good
8 122 11 2012-04-14 2013-05-01 poor
9 106 17 2013-01-24 2013-02-01 good
10 104 2 2013-02-24 2013-03-10 bitten
11 121 3 2013-03-01 2013-04-01 good
12 131 19 2013-04-11 2013-05-23 good
13 111 5 2013-05-22 2013-06-22 poor
14 131 2 2013-06-12 2013-07-23 bitten
15 122 23 2013-07-10 2013-08-12 good
16 107 29 2014-01-01 2014-02-14 good
17 110 7 2014-01-11 2014-02-01 good
18 105 2 2014-02-22 2014-03-02 bitten
What is a query I can use to find out which book was borrowed by the oldest borrower?
I am new to SQL and am using Microsoft SQL Server 2014
Here are two different solutions.
First, using two subqueries and one equi-join:
select Title
from Books b, Movement m
where b.BookID = m.BookID
  and m.BorrowerID = (select BorrowerID
                      from Borrowers
                      where Birthday = (select MIN(Birthday)
                                        from Borrowers))
Second, using two equi-joins and one subquery:
select Title
from Books b, Borrowers r, Movement m
where b.BookID = m.BookID
  and m.BorrowerID = r.BorrowerID
  and Birthday = (select MIN(Birthday) from Borrowers)
Both queries give the following answer:
Title
------------------------------
Relativity
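
The same lookup can also be written with explicit JOIN ... ON clauses. The sketch below checks that form on a handful of the question's rows; it runs in SQLite through Python's sqlite3 module rather than in SQL Server, so treat it as an illustration of the join logic, not as tested T-SQL. Only the columns the query needs are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Books (BookID INTEGER, Title TEXT);
    CREATE TABLE Borrowers (BorrowerID INTEGER, Name TEXT, Birthday TEXT);
    CREATE TABLE Movement (MoveID INTEGER, BookID INTEGER, BorrowerID INTEGER);

    -- A few rows copied from the question's tables.
    INSERT INTO Books VALUES (104, '1984');
    INSERT INTO Books VALUES (105, 'War and Peace');
    INSERT INTO Books VALUES (111, 'Relativity');
    INSERT INTO Borrowers VALUES (2, 'Bugs Bunny', '1938-09-08');
    INSERT INTO Borrowers VALUES (5, 'Mickey Mouse', '1928-02-08');
    INSERT INTO Borrowers VALUES (7, 'Fred Flintstone', '1960-06-09');
    INSERT INTO Movement VALUES (4, 105, 7);
    INSERT INTO Movement VALUES (10, 104, 2);
    INSERT INTO Movement VALUES (13, 111, 5);
    INSERT INTO Movement VALUES (18, 105, 2);
""")

# The earliest Birthday identifies the oldest borrower.
query = """
    SELECT b.Title, r.Name
    FROM Movement AS m
    JOIN Books AS b ON b.BookID = m.BookID
    JOIN Borrowers AS r ON r.BorrowerID = m.BorrowerID
    WHERE r.Birthday = (SELECT MIN(Birthday) FROM Borrowers)
"""
result = conn.execute(query).fetchall()
print(result)
```

If the oldest borrower had taken out several books, this query would return one row per loan.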

Calculate average values for rows with different ids in MS Excel

The file contains per-day information about products, and I need to calculate the monthly average values for each product.
Source data looks like this:
A B C D
id date rating price
1 1 2014/01/01 2 20
2 1 2014/01/02 2 20
3 1 2014/01/03 2 20
4 1 2014/01/04 1 20
5 1 2014/01/05 1 20
6 1 2014/01/06 1 20
7 1 2014/01/07 1 20
8 3 2014/01/01 5 99
9 3 2014/01/02 5 99
10 3 2014/01/03 5 99
11 3 2014/01/04 5 99
12 3 2014/01/05 5 120
13 3 2014/01/06 5 120
14 3 2014/01/07 5 120
I need to get:
   A   B       C
   id  rating  price
1  1   1.42    20
2  3   5       108
How can I do that? Do I need an advanced formula or a VB script?
Update: I have data for a long period, about 2 years. I need to calculate average values for each product for each week, and then for each month.
Source data example:
id date rating
4 2013-09-01 445
4 2013-09-02 446
4 2013-09-03 447
4 2013-09-04 448
4 2013-09-05 449
4 2013-09-06 450
4 2013-09-07 451
4 2013-09-08 452
4 2013-09-09 453
4 2013-09-10 454
4 2013-09-11 455
4 2013-09-12 456
4 2013-09-13 457
4 2013-09-14 458
4 2013-09-15 459
4 2013-09-16 460
4 2013-09-17 461
4 2013-09-18 462
4 2013-09-19 463
4 2013-09-20 464
4 2013-09-21 465
4 2013-09-22 466
4 2013-09-23 467
4 2013-09-24 468
4 2013-09-25 469
4 2013-09-26 470
4 2013-09-27 471
4 2013-09-28 472
4 2013-09-29 473
4 2013-09-30 474
4 2013-10-01 475
4 2013-10-02 476
4 2013-10-03 477
4 2013-10-04 478
4 2013-10-05 479
4 2013-10-06 480
4 2013-10-07 481
4 2013-10-08 482
4 2013-10-09 483
4 2013-10-10 484
4 2013-10-11 485
4 2013-10-12 486
4 2013-10-13 487
4 2013-10-14 488
4 2013-10-15 489
4 2013-10-16 490
4 2013-10-17 491
4 2013-10-18 492
4 2013-10-19 493
4 2013-10-20 494
4 2013-10-21 495
4 2013-10-22 496
4 2013-10-23 497
4 2013-10-24 498
4 2013-10-25 499
4 2013-10-26 500
4 2013-10-27 501
4 2013-10-28 502
4 2013-10-29 503
4 2013-10-30 504
4 2013-10-31 505
7 2013-09-01 1445
7 2013-09-02 1446
7 2013-09-03 1447
7 2013-09-04 1448
7 2013-09-05 1449
7 2013-09-06 1450
7 2013-09-07 1451
7 2013-09-08 1452
7 2013-09-09 1453
7 2013-09-10 1454
7 2013-09-11 1455
7 2013-09-12 1456
7 2013-09-13 1457
7 2013-09-14 1458
7 2013-09-15 1459
7 2013-09-16 1460
7 2013-09-17 1461
7 2013-09-18 1462
7 2013-09-19 1463
7 2013-09-20 1464
7 2013-09-21 1465
7 2013-09-22 1466
7 2013-09-23 1467
7 2013-09-24 1468
7 2013-09-25 1469
7 2013-09-26 1470
7 2013-09-27 1471
7 2013-09-28 1472
7 2013-09-29 1473
7 2013-09-30 1474
7 2013-10-01 1475
7 2013-10-02 1476
7 2013-10-03 1477
7 2013-10-04 1478
7 2013-10-05 1479
7 2013-10-06 1480
7 2013-10-07 1481
7 2013-10-08 1482
7 2013-10-09 1483
7 2013-10-10 1484
7 2013-10-11 1485
7 2013-10-12 1486
7 2013-10-13 1487
7 2013-10-14 1488
7 2013-10-15 1489
7 2013-10-16 1490
7 2013-10-17 1491
7 2013-10-18 1492
7 2013-10-19 1493
7 2013-10-20 1494
7 2013-10-21 1495
7 2013-10-22 1496
7 2013-10-23 1497
7 2013-10-24 1498
7 2013-10-25 1499
7 2013-10-26 1500
7 2013-10-27 1501
7 2013-10-28 1502
7 2013-10-29 1503
7 2013-10-30 1504
7 2013-10-31 1505
This is a job for a pivot table, and it takes about 30 seconds to set up.
Update:
As per your update, put the date into the Report Filter and modify to suit.
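
For anyone who would rather script this than build it in Excel, the same pivot can be sketched in pandas. This is an alternative, not what the answer above describes: `daily` rebuilds the first table from the question, and `history` regenerates the update's two-month sample (ratings rising by 1 per day from 445 and 1445). Both variable names are made up for illustration.

```python
import pandas as pd

# First table from the question: one row per product per day.
daily = pd.DataFrame({
    "id": [1] * 7 + [3] * 7,
    "date": list(pd.date_range("2014-01-01", periods=7)) * 2,
    "rating": [2, 2, 2, 1, 1, 1, 1] + [5] * 7,
    "price": [20] * 7 + [99, 99, 99, 99, 120, 120, 120],
})

# One row per product with the mean of each value column.
overall = daily.pivot_table(index="id", values=["rating", "price"], aggfunc="mean")
print(overall)

# The update's sample: ids 4 and 7, 2013-09-01 through 2013-10-31.
dates = pd.date_range("2013-09-01", "2013-10-31")
history = pd.concat(
    [
        pd.DataFrame({
            "id": [pid] * len(dates),
            "date": dates,
            "rating": list(range(start, start + len(dates))),
        })
        for pid, start in [(4, 445), (7, 1445)]
    ],
    ignore_index=True,
)

# Average rating per product per calendar month.
monthly = history.groupby(["id", history["date"].dt.to_period("M")])["rating"].mean()
print(monthly)
```

Swapping the "M" period for "W" groups by week instead, which covers the weekly averages asked for in the update.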