I have a series;
Red 33
Blue 44
Green 22
And also this series;
0 100
1 100
2 100
3 200
4 200
5 200
I want to multiply these in a way to give the following dataframe
Red Blue Green
0 330 440 220
1 330 440 220
2 330 440 220
3 660 880 440
4 660 880 440
5 660 880 440
Can anyone see a simply / tidy way this could be done?
IIUC assuming s is the name of the first series and s1 is the name of the second series, try:
m=s.to_frame().T
pd.DataFrame(m.values*s1.values[:,None],columns=m.columns)
Red Blue Green
0 3300 4400 2200
1 3300 4400 2200
2 3300 4400 2200
3 6600 8800 4400
4 6600 8800 4400
5 6600 8800 4400
Related
I have a dataset that looks something like this. I wish to calculate a modified moving average (column Mod_MA) for sales column based on the following logic :
If there is no event, then ST Else Average last 4 dates.
Date
Item
Event
ST
Mod_MA
2022-10-01
ABC
100
100
2022-10-02
ABC
110
110
2022-10-03
ABC
120
120
2022-10-04
ABC
130
130
2022-10-05
ABC
EV1
140
115
2022-10-06
ABC
EV1
150
119
2022-10-07
ABC
160
160
2022-10-08
ABC
170
170
2022-10-09
ABC
180
180
2022-10-10
ABC
EV2
190
157
2022-10-11
ABC
EV2
200
167
2022-10-12
ABC
EV2
210
168
2022-10-01
XYZ
100
100
2022-10-02
XYZ
110
110
2022-10-03
XYZ
120
120
2022-10-04
XYZ
130
130
2022-10-05
XYZ
EV3
140
115
2022-10-06
XYZ
EV3
150
119
2022-10-07
XYZ
EV3
160
121
2022-10-08
XYZ
170
170
2022-10-09
XYZ
180
180
2022-10-10
XYZ
EV4
190
147
2022-10-11
XYZ
EV4
200
155
2022-10-12
XYZ
210
210
Hopefully the image helps clarify what I am going for.
I have tried LAG & AVG OVER ORDER BY but since I dont have an exact number of iterations I need to run, these dont work.
Calculation Formulae
Would appreciate any help.
I have a data frame like this- Machine Vibration data.
datetime
tagid
value
quality
0
2021-03-01 13:43:41.440
B42
345
192
1
2021-03-01 13:43:41.440
B43
958
192
2
2021-03-01 13:43:41.440
B44
993
192
3
2021-03-01 13:43:41.440
B45
1224
192
4
2021-03-01 13:43:43.527
B188
6665
192
5
2021-03-01 13:43:43.527
B189
7162
192
6
2021-03-01 13:43:43.527
B190
7193
192
7
2021-03-01 13:43:43.747
C29
2975
192
8
2021-03-01 13:43:43.747
C30
4445
192
9
2021-03-01 13:43:43.747
C31
4015
192
I want to convert this to hourly maximum value for each tag id.
Sample Output
datetime
tagid
value
quality
01-03-2021 13:00
C91
3982
192
01-03-2021 14:00
C91
3972
192
01-03-2021 13:00
C92
9000
192
01-03-2021 14:00
C92
9972
192
01-03-2021 13:00
B42
396
192
01-03-2021 14:00
B42
370
192
01-03-2021 15:00
B42
370
192
I tried with grouper, but couldn't get output.
Use Grouper with aggregate max:
df = df.groupby([pd.Grouper(freq='H', key='datetime'), 'tagid']).max().reset_index()
I have following dataframe in pandas
code tank length dia diff
123 3 625 210 -0.38
123 5 635 210 1.2
I want to add 1 only in length for 5 times if the diff is positive and subtract 1 if the dip is negative. My desired dataframe looks like
code tank length diameter
123 3 625 210
123 3 624 210
123 3 623 210
123 3 622 210
123 3 621 210
123 3 620 210
123 5 635 210
123 5 636 210
123 5 637 210
123 5 638 210
123 5 639 210
123 5 640 210
I am doing following in pandas.
df.add(1)
But, its adding 1 to all the columns.
Use Index.repeat 6 times, then add counter values by GroupBy.cumcount and last create default RangeIndex by DataFrame.set_index:
df1 = df.loc[df.index.repeat(6)].copy()
df1['length'] += df1.groupby(level=0).cumcount()
df1 = df1.reset_index(drop=True)
Or:
df1 = (df.loc[df.index.repeat(6)]
.assign(length = lambda x: x.groupby(level=0).cumcount() + x['length'])
.reset_index(drop=True))
print (df1)
code tank length dia
0 123 3 625 210
1 123 3 626 210
2 123 3 627 210
3 123 3 628 210
4 123 3 629 210
5 123 3 630 210
6 123 5 635 210
7 123 5 636 210
8 123 5 637 210
9 123 5 638 210
10 123 5 639 210
11 123 5 640 210
EDIT:
df1 = df.loc[df.index.repeat(6)].copy()
add = df1.groupby(level=0).cumcount()
mask = df1['diff'] < 0
df1['length'] = np.where(mask, df1['length'] - add, df1['length'] + add)
df1 = df1.reset_index(drop=True)
print (df1)
code tank length dia diff
0 123 3 625 210 -0.38
1 123 3 624 210 -0.38
2 123 3 623 210 -0.38
3 123 3 622 210 -0.38
4 123 3 621 210 -0.38
5 123 3 620 210 -0.38
6 123 5 635 210 1.20
7 123 5 636 210 1.20
8 123 5 637 210 1.20
9 123 5 638 210 1.20
10 123 5 639 210 1.20
11 123 5 640 210 1.20
We can use pd.concat, np.cumsum and groupby + .add.
If you want to substract, simply multiply addition * -1 so for example: (np.cumsum(np.ones(n))-1) * -1
n = 6
new = pd.concat([df]*n).sort_values(['code', 'length']).reset_index(drop=True)
addition = np.cumsum(np.ones(n))-1
new['length'] = new.groupby(['code', 'tank'])['length'].apply(lambda x: x.add(addition))
Output
code tank length dia
0 123 3 625.0 210
1 123 3 626.0 210
2 123 3 627.0 210
3 123 3 628.0 210
4 123 3 629.0 210
5 123 3 630.0 210
6 123 5 635.0 210
7 123 5 636.0 210
8 123 5 637.0 210
9 123 5 638.0 210
10 123 5 639.0 210
11 123 5 640.0 210
I have a following dataframe
code date time product tank stock out_value
123 2019-06-20 07:00 MS 1 370 350
123 2019-06-20 07:30 HS 3 340 350
123 2019-06-20 07:00 MS 2 340 350
123 2019-06-20 07:30 HS 4 340 350
123 2019-06-20 08:00 MS 1 470 350
123 2019-06-20 08:30 HS 3 450 350
123 2019-06-20 08:00 MS 2 470 350
123 2019-06-20 08:30 HS 4 490 350
123 2019-06-20 09:30 HS 4 0 350
234 2019-06-20 09:30 HS 1 200 350
I want to find out which stock values are less than out_value in above dataframe excluding 0 value.
e.g. at 07:30 for ro code 123 on date 2019-06-20 for product HS there are two tanks 3 and 4, so if stocks for both the tanks are below out_value then flag is set to 1.
My desired dataframe would be
code date time product tank stock out_value flag
123 2019-06-20 07:00 MS 1 370 350 0
123 2019-06-20 07:30 HS 3 340 350 1
123 2019-06-20 07:00 MS 2 340 350 0
123 2019-06-20 07:30 HS 4 340 350 1
123 2019-06-20 08:00 MS 1 470 350 0
123 2019-06-20 08:30 HS 3 450 350 0
123 2019-06-20 08:00 MS 2 470 350 0
123 2019-06-20 08:30 HS 4 490 350 0
123 2019-06-20 09:30 HS 4 0 350 0
234 2019-06-20 09:30 HS 1 200 350 1
How can I do it in pandas?
If need check difference with non 0 values and then check all True values per groups with GroupBy.transform and GroupBy.all:
df['flag'] = ((df['stock']<df['out_value']) & (df['stock'] !=0))
df['flag'] = df.groupby(['code','date','time','product'])['flag'].transform('all').astype(int)
print (df)
code date time product tank stock out_value flag
0 123 2019-06-20 07:00 MS 1 370 350 0
1 123 2019-06-20 07:30 HS 3 340 350 1
2 123 2019-06-20 07:00 MS 2 340 350 0
3 123 2019-06-20 07:30 HS 4 340 350 1
4 123 2019-06-20 08:00 MS 1 470 350 0
5 123 2019-06-20 08:30 HS 3 450 350 0
6 123 2019-06-20 08:00 MS 2 470 350 0
7 123 2019-06-20 08:30 HS 4 490 350 0
8 123 2019-06-20 09:30 HS 4 0 350 0
9 234 2019-06-20 09:30 HS 1 200 350 1
Or if need test only difference, test per groups and last chain with mask for test non 0 values:
df['flag'] = df['stock']<df['out_value']
mask = df.groupby(['code','date','time','product'])['flag'].transform('all')
df['flag'] = (mask & (df['stock'] !=0)).astype(int)
This should do it:
df['flag'] = (df.assign(flag=(df.stock<df.out_value)&(df.stock>0))
.groupby(['code', 'date', 'time', 'product'], as_index=False)['flag']
.transform(all)
.astype(int))
df
code date time product tank stock out_value flag
0 123 2019-06-20 07:00 MS 1 370 350 0
1 123 2019-06-20 07:30 HS 3 340 350 1
2 123 2019-06-20 07:00 MS 2 340 350 0
3 123 2019-06-20 07:30 HS 4 340 350 1
4 123 2019-06-20 08:00 MS 1 470 350 0
5 123 2019-06-20 08:30 HS 3 450 350 0
6 123 2019-06-20 08:00 MS 2 470 350 0
7 123 2019-06-20 08:30 HS 4 490 350 0
8 123 2019-06-20 09:30 HS 4 0 350 0
9 234 2019-06-20 09:30 HS 1 200 350 1
you could do, it gives(I guess) the right result for the dataframe you provided, but I'm not sure if that's what you want.
df['flag'] = ((df['stock']<df['out_value']) & (df['stock'] !=0)).astype(int)
To me, it is quite unclear what you're asking. If you want to flag as 1, all the rows which has stock below to out_value, except if they are 0, you can do...
df['flag'] = 0
df.loc[(df['stock'] < df['out_value']) & (df['stock'] != 0), 'flag'] = 1
The database data is from http://www.w3resource.com/mysql-exercises/join-exercises/
sqlite> select * from employees;
EMPLOYEE_ID FIRST_NAME LAST_NAME EMAIL PHONE_NUMBER HIRE_DATE JOB_ID SALARY COMMISSION_PCT MANAGER_ID DEPARTMENT_ID
----------- ------------- ------------- ---------- -------------------- ------------ ------------ ---------- -------------- ---------- -------------
100 Steven King SKING 515.123.4567 1987-06-17 AD_PRES 24000 0.0 0 90
101 Neena Kochhar NKOCHHAR 515.123.4568 1987-06-18 AD_VP 17000 0.0 100 90
102 Lex De Haan LDEHAAN 515.123.4569 1987-06-19 AD_VP 17000 0.0 100 90
...
202 Pat Fay PFAY 603.123.6666 1987-09-27 MK_REP 6000 0.0 201 20
203 Susan Mavris SMAVRIS 515.123.7777 1987-09-28 HR_REP 6500 0.0 101 40
204 Hermann Baer HBAER 515.123.8888 1987-09-29 PR_REP 10000 0.0 101 70
205 Shelley Higgins SHIGGINS 515.123.8080 1987-09-30 AC_MGR 12000 0.0 101 110
206 William Gietz WGIETZ 515.123.8181 1987-10-01 AC_ACCOUNT 8300 0.0 205 110
sqlite> select * from departments;
DEPARTMENT_ID DEPARTMENT_NAME MANAGER_ID LOCATION_ID
------------- ---------------------- ---------- -----------
10 Administration 200 1700
20 Marketing 201 1800
30 Purchasing 114 1700
40 Human Resources 203 2400
50 Shipping 121 1500
60 IT 103 1400
70 Public Relations 204 2700
80 Sales 145 2500
90 Executive 100 1700
100 Finance 108 1700
110 Accounting 205 1700
120 Treasury 0 1700
130 Corporate Tax 0 1700
140 Control And Credit 0 1700
150 Shareholder Services 0 1700
160 Benefits 0 1700
170 Manufacturing 0 1700
180 Construction 0 1700
190 Contracting 0 1700
200 Operations 0 1700
210 IT Support 0 1700
220 NOC 0 1700
230 IT Helpdesk 0 1700
240 Government Sales 0 1700
250 Retail Sales 0 1700
260 Recruiting 0 1700
270 Payroll 0 1700
The natural join result:
sqlite> select * from employees e natural join departments d;
EMPLOYEE_ID FIRST_NAME LAST_NAME EMAIL PHONE_NUMBER HIRE_DATE JOB_ID SALARY COMMISSION_PCT MANAGER_ID DEPARTMENT_ID DEPARTMENT_NAME LOCATION_ID
----------- ------------- ------------- ---------- -------------------- ------------ ------------ ---------- -------------- ---------- ------------- ---------------------- -----------
101 Neena Kochhar NKOCHHAR 515.123.4568 1987-06-18 AD_VP 17000 0.0 100 90 Executive 1700
102 Lex De Haan LDEHAAN 515.123.4569 1987-06-19 AD_VP 17000 0.0 100 90 Executive 1700
104 Bruce Ernst BERNST 590.423.4568 1987-06-21 IT_PROG 6000 0.0 103 60 IT 1400
105 David Austin DAUSTIN 590.423.4569 1987-06-22 IT_PROG 4800 0.0 103 60 IT 1400
106 Valli Pataballa VPATABAL 590.423.4560 1987-06-23 IT_PROG 4800 0.0 103 60 IT 1400
107 Diana Lorentz DLORENTZ 590.423.5567 1987-06-24 IT_PROG 4200 0.0 103 60 IT 1400
109 Daniel Faviet DFAVIET 515.124.4169 1987-06-26 FI_ACCOUNT 9000 0.0 108 100 Finance 1700
110 John Chen JCHEN 515.124.4269 1987-06-27 FI_ACCOUNT 8200 0.0 108 100 Finance 1700
111 Ismael Sciarra ISCIARRA 515.124.4369 1987-06-28 FI_ACCOUNT 7700 0.0 108 100 Finance 1700
112 Jose Manuel Urman JMURMAN 515.124.4469 1987-06-29 FI_ACCOUNT 7800 0.0 108 100 Finance 1700
113 Luis Popp LPOPP 515.124.4567 1987-06-30 FI_ACCOUNT 6900 0.0 108 100 Finance 1700
115 Alexander Khoo AKHOO 515.127.4562 1987-07-02 PU_CLERK 3100 0.0 114 30 Purchasing 1700
116 Shelli Baida SBAIDA 515.127.4563 1987-07-03 PU_CLERK 2900 0.0 114 30 Purchasing 1700
117 Sigal Tobias STOBIAS 515.127.4564 1987-07-04 PU_CLERK 2800 0.0 114 30 Purchasing 1700
118 Guy Himuro GHIMURO 515.127.4565 1987-07-05 PU_CLERK 2600 0.0 114 30 Purchasing 1700
119 Karen Colmenares KCOLMENA 515.127.4566 1987-07-06 PU_CLERK 2500 0.0 114 30 Purchasing 1700
129 Laura Bissot LBISSOT 650.124.5234 1987-07-16 ST_CLERK 3300 0.0 121 50 Shipping 1500
130 Mozhe Atkinson MATKINSO 650.124.6234 1987-07-17 ST_CLERK 2800 0.0 121 50 Shipping 1500
131 James Marlow JAMRLOW 650.124.7234 1987-07-18 ST_CLERK 2500 0.0 121 50 Shipping 1500
132 TJ Olson TJOLSON 650.124.8234 1987-07-19 ST_CLERK 2100 0.0 121 50 Shipping 1500
150 Peter Tucker PTUCKER 011.44.1344.129268 1987-08-06 SA_REP 10000 0.3 145 80 Sales 2500
151 David Bernstein DBERNSTE 011.44.1344.345268 1987-08-07 SA_REP 9500 0.25 145 80 Sales 2500
152 Peter Hall PHALL 011.44.1344.478968 1987-08-08 SA_REP 9000 0.25 145 80 Sales 2500
153 Christopher Olsen COLSEN 011.44.1344.498718 1987-08-09 SA_REP 8000 0.2 145 80 Sales 2500
154 Nanette Cambrault NCAMBRAU 011.44.1344.987668 1987-08-10 SA_REP 7500 0.2 145 80 Sales 2500
155 Oliver Tuvault OTUVAULT 011.44.1344.486508 1987-08-11 SA_REP 7000 0.15 145 80 Sales 2500
184 Nandita Sarchand NSARCHAN 650.509.1876 1987-09-09 SH_CLERK 4200 0.0 121 50 Shipping 1500
185 Alexis Bull ABULL 650.509.2876 1987-09-10 SH_CLERK 4100 0.0 121 50 Shipping 1500
186 Julia Dellinger JDELLING 650.509.3876 1987-09-11 SH_CLERK 3400 0.0 121 50 Shipping 1500
187 Anthony Cabrio ACABRIO 650.509.4876 1987-09-12 SH_CLERK 3000 0.0 121 50 Shipping 1500
202 Pat Fay PFAY 603.123.6666 1987-09-27 MK_REP 6000 0.0 201 20 Marketing 1800
206 William Gietz WGIETZ 515.123.8181 1987-10-01 AC_ACCOUNT 8300 0.0 205 110 Accounting 1700
sqlite> select count(*) from employees e natural join departments d;
count(*)
----------
32
The join result:
sqlite> select * from employees e join departments d using
(department_id);
EMPLOYEE_ID FIRST_NAME LAST_NAME EMAIL PHONE_NUMBER HIRE_DATE JOB_ID SALARY COMMISSION_PCT MANAGER_ID DEPARTMENT_ID DEPARTMENT_NAME MANAGER_ID LOCATION_ID
----------- ------------- ------------- ---------- -------------------- ------------ ------------ ---------- -------------- ---------- ------------- ---------------------- ---------- -----------
100 Steven King SKING 515.123.4567 1987-06-17 AD_PRES 24000 0.0 0 90 Executive 100 1700
101 Neena Kochhar NKOCHHAR 515.123.4568 1987-06-18 AD_VP 17000 0.0 100 90 Executive 100 1700
102 Lex De Haan LDEHAAN 515.123.4569 1987-06-19 AD_VP 17000 0.0 100 90 Executive 100 1700
103 Alexander Hunold AHUNOLD 590.423.4567 1987-06-20 IT_PROG 9000 0.0 102 60 IT 103 1400
104 Bruce Ernst BERNST 590.423.4568 1987-06-21 IT_PROG 6000 0.0 103 60 IT 103 1400
105 David Austin DAUSTIN 590.423.4569 1987-06-22 IT_PROG 4800 0.0 103 60 IT 103 1400
106 Valli Pataballa VPATABAL 590.423.4560 1987-06-23 IT_PROG 4800 0.0 103 60 IT 103 1400
107 Diana Lorentz DLORENTZ 590.423.5567 1987-06-24 IT_PROG 4200 0.0 103 60 IT 103 1400
108 Nancy Greenberg NGREENBE 515.124.4569 1987-06-25 FI_MGR 12000 0.0 101 100 Finance 108 1700
109 Daniel Faviet DFAVIET 515.124.4169 1987-06-26 FI_ACCOUNT 9000 0.0 108 100 Finance 108 1700
110 John Chen JCHEN 515.124.4269 1987-06-27 FI_ACCOUNT 8200 0.0 108 100 Finance 108 1700
111 Ismael Sciarra ISCIARRA 515.124.4369 1987-06-28 FI_ACCOUNT 7700 0.0 108 100 Finance 108 1700
112 Jose Manuel Urman JMURMAN 515.124.4469 1987-06-29 FI_ACCOUNT 7800 0.0 108 100 Finance 108 1700
...
155 Oliver Tuvault OTUVAULT 011.44.1344.486508 1987-08-11 SA_REP 7000 0.15 145 80 Sales 145 2500
156 Janette King JKING 011.44.1345.429268 1987-08-12 SA_REP 10000 0.35 146 80 Sales 145 2500
157 Patrick Sully PSULLY 011.44.1345.929268 1987-08-13 SA_REP 9500 0.35 146 80 Sales 145 2500
158 Allan McEwen AMCEWEN 011.44.1345.829268 1987-08-14 SA_REP 9000 0.35 146 80 Sales 145 2500
159 Lindsey Smith LSMITH 011.44.1345.729268 1987-08-15 SA_REP 8000 0.3 146 80 Sales 145 2500
160 Louise Doran LDORAN 011.44.1345.629268 1987-08-16 SA_REP 7500 0.3 146 80 Sales 145 2500
161 Sarath Sewall SSEWALL 011.44.1345.529268 1987-08-17 SA_REP 7000 0.25 146 80 Sales 145 2500
162 Clara Vishney CVISHNEY 011.44.1346.129268 1987-08-18 SA_REP 10500 0.25 147 80 Sales 145 2500
163 Danielle Greene DGREENE 011.44.1346.229268 1987-08-19 SA_REP 9500 0.15 147 80 Sales 145 2500
164 Mattea Marvins MMARVINS 011.44.1346.329268 1987-08-20 SA_REP 7200 0.1 147 80 Sales 145 2500
165 David Lee DLEE 011.44.1346.529268 1987-08-21 SA_REP 6800 0.1 147 80 Sales 145 2500
166 Sundar Ande SANDE 011.44.1346.629268 1987-08-22 SA_REP 6400 0.1 147 80 Sales 145 2500
167 Amit Banda ABANDA 011.44.1346.729268 1987-08-23 SA_REP 6200 0.1 147 80 Sales 145 2500
168 Lisa Ozer LOZER 011.44.1343.929268 1987-08-24 SA_REP 11500 0.25 148 80 Sales 145 2500
169 Harrison Bloom HBLOOM 011.44.1343.829268 1987-08-25 SA_REP 10000 0.2 148 80 Sales 145 2500
170 Tayler Fox TFOX 011.44.1343.729268 1987-08-26 SA_REP 9600 0.2 148 80 Sales 145 2500
171 William Smith WSMITH 011.44.1343.629268 1987-08-27 SA_REP 7400 0.15 148 80 Sales 145 2500
172 Elizabeth Bates EBATES 011.44.1343.529268 1987-08-28 SA_REP 7300 0.15 148 80 Sales 145 2500
173 Sundita Kumar SKUMAR 011.44.1343.329268 1987-08-29 SA_REP 6100 0.1 148 80 Sales 145 2500
174 Ellen Abel EABEL 011.44.1644.429267 1987-08-30 SA_REP 11000 0.3 149 80 Sales 145 2500
175 Alyssa Hutton AHUTTON 011.44.1644.429266 1987-08-31 SA_REP 8800 0.25 149 80 Sales 145 2500
176 Jonathon Taylor JTAYLOR 011.44.1644.429265 1987-09-01 SA_REP 8600 0.2 149 80 Sales 145 2500
177 Jack Livingston JLIVINGS 011.44.1644.429264 1987-09-02 SA_REP 8400 0.2 149 80 Sales 145 2500
179 Charles Johnson CJOHNSON 011.44.1644.429262 1987-09-04 SA_REP 6200 0.1 149 80 Sales 145 2500
180 Winston Taylor WTAYLOR 650.507.9876 1987-09-05 SH_CLERK 3200 0.0 120 50 Shipping 121 1500
181 Jean Fleaur JFLEAUR 650.507.9877 1987-09-06 SH_CLERK 3100 0.0 120 50 Shipping 121 1500
182 Martha Sullivan MSULLIVA 650.507.9878 1987-09-07 SH_CLERK 2500 0.0 120 50 Shipping 121 1500
183 Girard Geoni GGEONI 650.507.9879 1987-09-08 SH_CLERK 2800 0.0 120 50 Shipping 121 1500
184 Nandita Sarchand NSARCHAN 650.509.1876 1987-09-09 SH_CLERK 4200 0.0 121 50 Shipping 121 1500
185 Alexis Bull ABULL 650.509.2876 1987-09-10 SH_CLERK 4100 0.0 121 50 Shipping 121 1500
186 Julia Dellinger JDELLING 650.509.3876 1987-09-11 SH_CLERK 3400 0.0 121 50 Shipping 121 1500
187 Anthony Cabrio ACABRIO 650.509.4876 1987-09-12 SH_CLERK 3000 0.0 121 50 Shipping 121 1500
188 Kelly Chung KCHUNG 650.505.1876 1987-09-13 SH_CLERK 3800 0.0 122 50 Shipping 121 1500
189 Jennifer Dilly JDILLY 650.505.2876 1987-09-14 SH_CLERK 3600 0.0 122 50 Shipping 121 1500
190 Timothy Gates TGATES 650.505.3876 1987-09-15 SH_CLERK 2900 0.0 122 50 Shipping 121 1500
191 Randall Perkins RPERKINS 650.505.4876 1987-09-16 SH_CLERK 2500 0.0 122 50 Shipping 121 1500
192 Sarah Bell SBELL 650.501.1876 1987-09-17 SH_CLERK 4000 0.0 123 50 Shipping 121 1500
193 Britney Everett BEVERETT 650.501.2876 1987-09-18 SH_CLERK 3900 0.0 123 50 Shipping 121 1500
194 Samuel McCain SMCCAIN 650.501.3876 1987-09-19 SH_CLERK 3200 0.0 123 50 Shipping 121 1500
195 Vance Jones VJONES 650.501.4876 1987-09-20 SH_CLERK 2800 0.0 123 50 Shipping 121 1500
196 Alana Walsh AWALSH 650.507.9811 1987-09-21 SH_CLERK 3100 0.0 124 50 Shipping 121 1500
197 Kevin Feeney KFEENEY 650.507.9822 1987-09-22 SH_CLERK 3000 0.0 124 50 Shipping 121 1500
198 Donald OConnell DOCONNEL 650.507.9833 1987-09-23 SH_CLERK 2600 0.0 124 50 Shipping 121 1500
199 Douglas Grant DGRANT 650.507.9844 1987-09-24 SH_CLERK 2600 0.0 124 50 Shipping 121 1500
200 Jennifer Whalen JWHALEN 515.123.4444 1987-09-25 AD_ASST 4400 0.0 101 10 Administration 200 1700
201 Michael Hartstein MHARTSTE 515.123.5555 1987-09-26 MK_MAN 13000 0.0 100 20 Marketing 201 1800
202 Pat Fay PFAY 603.123.6666 1987-09-27 MK_REP 6000 0.0 201 20 Marketing 201 1800
203 Susan Mavris SMAVRIS 515.123.7777 1987-09-28 HR_REP 6500 0.0 101 40 Human Resources 203 2400
204 Hermann Baer HBAER 515.123.8888 1987-09-29 PR_REP 10000 0.0 101 70 Public Relations 204 2700
205 Shelley Higgins SHIGGINS 515.123.8080 1987-09-30 AC_MGR 12000 0.0 101 110 Accounting 205 1700
206 William Gietz WGIETZ 515.123.8181 1987-10-01 AC_ACCOUNT 8300 0.0 205 110 Accounting 205 1700
sqlite> select count(*) from employees e join departments d using (department_id);
count(*)
----------
106
The natural join result rows count should be same as join, but not, why?
The different between a natural join and a 'normal' join is that the former use all columns that happen to have the same name in both tables.
In this case, both DEPARTMENT_ID and MANAGER_ID match, so the natural join is actually the same as this query:
select * from employees e join departments d using (department_id, manager_id);
This is why you should never, ever use a natural join.