Loss nan problem when using TFBertForSequenceClassification - tensorflow

I have a problem when training a model for multi-label text classification.
I'm working in Colab as follows:
import tensorflow as tf
from transformers import BertConfig, TFBertForSequenceClassification

def create_sentiment_bert():
    config = BertConfig.from_pretrained("monologg/kobert", num_labels=52)
    model = TFBertForSequenceClassification.from_pretrained("monologg/kobert", config=config, from_pt=True)
    opt = tf.keras.optimizers.Adam(learning_rate=4.0e-6)
    loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
    metric = tf.keras.metrics.SparseCategoricalAccuracy("accuracy")
    model.compile(optimizer=opt, loss=loss, metrics=[metric])
    return model

sentiment_model = create_sentiment_bert()
sentiment_model.fit(train_x, train_y, epochs=2, shuffle=True, batch_size=250, validation_data=(test_x, test_y))
The result is as follows:
Epoch 1/2
739/14065 [>.............................] - ETA: 35:31 - loss: nan - accuracy: 0.0000e+00
I have checked my data: there are no NaN, null, or invalid values.
I tried different optimizers, numbers of epochs, and learning rates, but the problem persisted.
The number of labels is 52 and the distribution is as follows:
[Label] [Count]
501 694624
601 651306
401 257665
210 250352
307 170665
301 153318
306 147948
201 141382
302 113917
402 102040
606 101434
506 73492
305 69876
604 62056
403 57956
104 56800
107 55503
607 40293
503 36272
505 34757
303 26884
308 24539
304 22135
205 20744
509 19465
206 16665
508 15334
208 13335
603 13240
504 12299
602 10684
202 10366
209 8267
106 6564
502 5880
211 5804
207 2794
507 1967
108 1860
204 1633
105 1545
109 682
605 426
102 276
101 274
405 268
212 204
213 153
103 103
203 90
404 65
608 37
I'm a beginner in this area. Please help me. Thanks in advance!

Why do you have from_logits=False? The classifier head returns logits, so unless you put a softmax activation inside your model, you need to calculate the loss from logits.
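For illustration, a minimal sketch of that change, assuming you keep the stock classification head (which outputs raw, unnormalized logits):

import tensorflow as tf

# The classification head outputs raw logits, so let the loss apply
# softmax internally rather than treating the outputs as probabilities.
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
metric = tf.keras.metrics.SparseCategoricalAccuracy("accuracy")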


How to solve an error, "module 'numpy' has no attribute 'float'"?

Environment
WSL2
Docker
Virtualenv
Python 3.8.16
jupyterlab 3.5.2
numpy 1.24.1
prophet 1.1.1
fbprophet 0.7.1
Cython 0.29.33
ipython 8.8.0
pmdarima 2.0.2
plotly 5.11.0
pip 22.3.1
pystan 2.19.1.1
scikit-learn 1.2.0
konlpy 0.6.0 (just in case)
nodejs 0.1.1 (just in case)
pandas 1.5.2 (just in case)
Error
main error message
AttributeError: module 'numpy' has no attribute 'float'
entire error message
INFO:fbprophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[33], line 4
1 # Load the Prophet() model
2 # and train it with fit()
3 model_revenue = Prophet()
----> 4 model_revenue.fit(revenue_serial)
File /home/.venv/lib/python3.8/site-packages/fbprophet/forecaster.py:1115, in Prophet.fit(self, df, **kwargs)
1112 self.history = history
1113 self.set_auto_seasonalities()
1114 seasonal_features, prior_scales, component_cols, modes = (
-> 1115 self.make_all_seasonality_features(history))
1116 self.train_component_cols = component_cols
1117 self.component_modes = modes
File /home/.venv/lib/python3.8/site-packages/fbprophet/forecaster.py:765, in Prophet.make_all_seasonality_features(self, df)
763 # Seasonality features
764 for name, props in self.seasonalities.items():
--> 765 features = self.make_seasonality_features(
766 df['ds'],
767 props['period'],
768 props['fourier_order'],
769 name,
770 )
771 if props['condition_name'] is not None:
772 features[~df[props['condition_name']]] = 0
File /home/.venv/lib/python3.8/site-packages/fbprophet/forecaster.py:458, in Prophet.make_seasonality_features(cls, dates, period, series_order, prefix)
442 @classmethod
443 def make_seasonality_features(cls, dates, period, series_order, prefix):
444 """Data frame with seasonality features.
445
446 Parameters
(...)
456 pd.DataFrame with seasonality features.
457 """
--> 458 features = cls.fourier_series(dates, period, series_order)
459 columns = [
460 '{}_delim_{}'.format(prefix, i + 1)
461 for i in range(features.shape[1])
462 ]
463 return pd.DataFrame(features, columns=columns)
File /home/.venv/lib/python3.8/site-packages/fbprophet/forecaster.py:434, in Prophet.fourier_series(dates, period, series_order)
417 """Provides Fourier series components with the specified frequency
418 and order.
419
(...)
428 Matrix with seasonality features.
429 """
430 # convert to days since epoch
431 t = np.array(
432 (dates - datetime(1970, 1, 1))
433 .dt.total_seconds()
--> 434 .astype(np.float)
435 ) / (3600 * 24.)
436 return np.column_stack([
437 fun((2.0 * (i + 1) * np.pi * t / period))
438 for i in range(series_order)
439 for fun in (np.sin, np.cos)
440 ])
File /home/.venv/lib/python3.8/site-packages/numpy/__init__.py:284, in __getattr__(attr)
281 from .testing import Tester
282 return Tester
--> 284 raise AttributeError("module {!r} has no attribute "
285 "{!r}".format(__name__, attr))
AttributeError: module 'numpy' has no attribute 'float'
Example of dataset
ds y
0 2022-09-01 13:00:00 762
1 2022-09-01 15:00:00 746
2 2022-09-01 17:00:00 848
3 2022-09-01 19:00:00 866
4 2022-09-01 21:00:00 632
... ... ...
1881 2022-10-31 13:00:00 684
1882 2022-10-31 15:00:00 749
1883 2022-10-31 17:00:00 779
1884 2022-10-31 19:00:00 573
1885 2022-10-31 21:00:00 510
Type of variable
visitors_serial
ds datetime64[ns]
y int64
dtype: object
Short code
...
revenue_serial = pd.DataFrame(pd.to_datetime(df_active_time['START_DATE'], format="%Y%m%d %H:%M:%S"))
revenue_serial['객단가(원)']=df_active_time['객단가(원)']
revenue_serial = revenue_serial.reset_index(drop= True)
revenue_serial = revenue_serial.rename(columns={'START_DATE':'ds', '객단가(원)':'y'})
model_revenue = Prophet()
model_revenue.fit(revenue_serial)
I expected that upgrading the numpy module would solve it, but it didn't.
You can see the cause right in your traceback: numpy really has no attribute float anymore. The failing line (inside fbprophet, not your own code) is
t = np.array((dates - datetime(1970, 1, 1)).dt.total_seconds().astype(np.float)) / (3600 * 24.)
and it should be something like
t = np.array(
    (dates - datetime(1970, 1, 1))
    .dt.total_seconds()
    .astype(np.float32)
) / (3600 * 24.)
The alias numpy.float was deprecated in NumPy 1.20 and was removed in NumPy 1.24.
You can change it to numpy.float_, numpy.float64, or numpy.double. They all mean the same thing.
For your dependency prophet, the actual issue was already fixed in #1850 (March 2021), and it does appear to be fixed in v1.1.1, so it looks like you're not running the version you think you are: the paths in your traceback show the import coming from fbprophet (the old 0.7.1 package), not from prophet 1.1.1.
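If you are stuck on fbprophet 0.7.1 for now, a common stopgap (a sketch that papers over the removed alias rather than fixing the dependency mismatch) is to restore the alias before importing:

import numpy as np

# np.float was merely an alias for the builtin float (i.e. float64);
# restore it so fbprophet's forecaster.py can still look it up.
np.float = float

from fbprophet import Prophet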

Removing duplicates based on matching column values with boolean indexing

After merging two DataFrames I have the following dataset:
DB_ID  x_val  y_val
x01    405    407
x01    405    405
x02    308    306
x02    308    308
x03    658    658
x03    658    660
x04    None   658
x04    None   660
x05    658    660
x06    660    660
The y table contains multiple values for the left join variable (not included in table), resulting in multiple rows per unique DB_ID (string variable, not in df index).
The issue is that only one row is correct, where x_val and y_val match. I tried removing the duplicates with the following code:
df= df[~df['DB_ID'].duplicated() | combined['x_val'] != combined['y_val']]
This however doesn't work. I am looking for a solution to achieve the following result:
DB_ID  x_val  y_val
x01    405    405
x02    308    308
x03    658    658
x04    None   658
x05    658    660
x06    660    660
The idea is to compare both columns for inequality, then sort and remove duplicates by DB_ID:
df = (df.assign(new = df['x_val'].ne(df['y_val']))
.sort_values(['DB_ID','new'])
.drop_duplicates('DB_ID')
.drop('new', axis=1))
print (df)
DB_ID x_val y_val
1 x01 405 405
3 x02 308 308
4 x03 658 658
6 x04 None 658
8 x05 658 660
9 x06 660 660
If NaNs or Nones should compare as equal, use:
df = (df.assign(new = df['x_val'].fillna('same').ne(df['y_val'].fillna('same')))
.sort_values(['DB_ID','new'])
.drop_duplicates('DB_ID')
.drop('new', axis=1))
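For reference, a self-contained run on the sample data (the frame below is reconstructed from the question's table; note pandas stores the None values in the numeric x_val column as NaN):

import pandas as pd

df = pd.DataFrame({
    'DB_ID': ['x01','x01','x02','x02','x03','x03','x04','x04','x05','x06'],
    'x_val': [405, 405, 308, 308, 658, 658, None, None, 658, 660],
    'y_val': [407, 405, 306, 308, 658, 660, 658, 660, 660, 660],
})

out = (df.assign(new = df['x_val'].fillna('same').ne(df['y_val'].fillna('same')))
         .sort_values(['DB_ID','new'])
         .drop_duplicates('DB_ID')
         .drop('new', axis=1))
print(out)
#   DB_ID  x_val  y_val
# 1   x01  405.0    405
# 3   x02  308.0    308
# 4   x03  658.0    658
# 6   x04    NaN    658
# 8   x05  658.0    660
# 9   x06  660.0    660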
Maybe you can simply use:
df = df[df['x_val'] == df['y_val']]
print(df)
# Output
DB_ID x_val y_val
1 x01 405 405
3 x02 308 308
4 x03 658 658
I think you don't need drop_duplicates or duplicated here, but if you want to ensure that only one instance of each DB_ID remains, you can append .drop_duplicates('DB_ID'):
df = df[df['x_val'] == df['y_val']].drop_duplicates('DB_ID')
print(df)
# Output
DB_ID x_val y_val
1 x01 405 405
3 x02 308 308
4 x03 658 658

How to plot using timestamp and coordinates?

I have logs of mouse movement, that is, coordinates and timestamps. I want to plot the mouse movement using this log. How can I do this? I have no idea what API or library can be used, and I'd like to know where to start, if there is an existing way to do this.
My log is as follows:
Date hr:min:sec ms x y
13/6/2020 13:13:33 521 291 283
13/6/2020 13:13:33 638 273 234
13/6/2020 13:13:33 647 272 233
13/6/2020 13:13:33 657 271 231
13/6/2020 13:13:33 667 269 230
13/6/2020 13:13:33 677 268 229
13/6/2020 13:13:33 687 267 228
13/6/2020 13:13:33 697 264 226
You're looking for geom_path() from ggplot2. The geom draws a line connecting all your observations in the order they appear in the dataframe. So, here's some x,y data, expanded a bit:
df <- data.frame(
x=c(291,273,272,271,269,268,267,264,262,261,261,265,268,280,290),
y=c(283,234,233,231,230,229,228,226,230,235,237,248,252,246,235)
)
And some code to make a simple plot using geom_path():
library(ggplot2)

p <- ggplot(df, aes(x=x, y=y)) + theme_classic() +
  geom_path(color='blue') + geom_point()
p
If you want, you can even save that as an animation based on your time points. See the code below using the gganimate package:
library(gganimate)
df$time <- 1:15
a <- p + transition_reveal(time)
animate(a, fps=20)
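If you would rather stay in Python, here is a minimal matplotlib sketch of the same idea (draw the points in log order); the file name mouse_log.txt and the whitespace-separated columns are assumptions based on the sample log above:

import pandas as pd
import matplotlib.pyplot as plt

# Parse the log; columns per the sample: date, hr:min:sec, ms, x, y.
df = pd.read_csv("mouse_log.txt", sep=r"\s+", skiprows=1,
                 names=["date", "time", "ms", "x", "y"])

fig, ax = plt.subplots()
ax.plot(df["x"], df["y"], color="blue", linewidth=1)  # path in log order
ax.scatter(df["x"], df["y"], s=10)                    # individual samples
ax.invert_yaxis()  # screen y coordinates usually grow downward
plt.show()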

Gurobi Warning and Inconsistency in Optimal Value [some integer variables take values larger than the maximum supported value (2000000000)]

I am using Gurobi version 8.1.0 with the Python API (Python 3.6) to solve MIP problems. I have two models whose global optima I believe should be equal. However, I found that they are not equal in one of my simulations. I then tried to warm-start the model whose solution I believe is incorrect (model-1) with the solution from the other model (model-2). In other words, the problem is to maximize the objective function; the objective value of model-1 is 42.3333, but I believe it should be 42.8333. Therefore, I used the solution from model-2, which has an objective value of 42.8333, to warm-start model-1.
What is weird is that the solution from model-2 should not be feasible for model-1, since its objective value is greater than 42.3333 and the problem is a maximization. However, it turns out to be a feasible warm start, and now the optimal value of model-1 is 42.8333. How can the same model report two different optimal values?
Changed value of parameter timeLimit to 10800.0
Prev: 1e+100 Min: 0.0 Max: 1e+100 Default: 1e+100
Changed value of parameter LogFile to output/inconsistent_Model-1.log
Prev: gurobi.log Default:
Optimize a model with 11277 rows, 15150 columns and 165637 nonzeros
Model has 5050 general constraints
Variable types: 0 continuous, 15150 integer (5050 binary)
Coefficient statistics:
Matrix range [1e+00, 1e+00]
Objective range [1e-02, 1e+00]
Bounds range [1e+00, 1e+00]
RHS range [1e+00, 5e+01]
Presolve removed 6167 rows and 7008 columns
Presolve time: 0.95s
Presolved: 5110 rows, 8142 columns, 37608 nonzeros
Presolved model has 3058 SOS constraint(s)
Variable types: 0 continuous, 8142 integer (4403 binary)
Warning: Markowitz tolerance tightened to 0.0625
Warning: Markowitz tolerance tightened to 0.125
Warning: Markowitz tolerance tightened to 0.25
Warning: Markowitz tolerance tightened to 0.5
Root relaxation: objective 4.333333e+01, 4856 iterations, 2.15 seconds
Nodes | Current Node | Objective Bounds | Work
Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time
0 0 43.33333 0 587 - 43.33333 - - 3s
0 0 43.26667 0 243 - 43.26667 - - 4s
0 0 43.20000 0 1282 - 43.20000 - - 4s
0 0 43.20000 0 567 - 43.20000 - - 4s
0 0 43.18333 0 1114 - 43.18333 - - 5s
0 0 43.16543 0 2419 - 43.16543 - - 5s
0 0 43.15556 0 1575 - 43.15556 - - 5s
0 0 43.15333 0 2271 - 43.15333 - - 5s
0 0 43.13333 0 727 - 43.13333 - - 5s
0 0 43.12778 0 1698 - 43.12778 - - 5s
0 0 43.12500 0 1146 - 43.12500 - - 5s
0 0 43.12500 0 1911 - 43.12500 - - 6s
0 0 43.11927 0 1859 - 43.11927 - - 6s
0 0 43.11845 0 2609 - 43.11845 - - 7s
0 0 43.11845 0 2631 - 43.11845 - - 7s
0 0 43.11845 0 2642 - 43.11845 - - 7s
0 0 43.11845 0 2462 - 43.11845 - - 8s
0 0 43.11845 0 2529 - 43.11845 - - 8s
0 0 43.11845 0 2529 - 43.11845 - - 9s
0 2 43.11845 0 2531 - 43.11845 - - 14s
41 35 43.09874 17 957 - 43.09874 - 29.4 15s
94 84 42.93207 33 716 - 43.09874 - 22.1 31s
117 101 42.91940 40 2568 - 43.09874 - 213 37s
264 175 infeasible 92 - 43.09874 - 133 73s
273 181 infeasible 97 - 43.09874 - 277 77s
293 191 42.42424 17 1828 - 43.09874 - 280 90s
369 249 42.40111 52 2633 - 43.09874 - 311 105s
383 257 42.39608 59 3062 - 43.09874 - 329 152s
408 265 42.39259 65 2819 - 43.09874 - 386 162s
419 274 41.51399 66 2989 - 43.09874 - 401 170s
454 282 41.29938 71 3000 - 43.09874 - 390 182s
462 280 infeasible 74 - 43.09874 - 423 192s
479 287 infeasible 78 - 43.09874 - 419 204s
498 293 40.51287 81 2564 - 43.09874 - 435 207s
526 307 40.16638 86 2619 - 43.09874 - 419 227s
584 330 42.63100 33 621 - 43.09874 - 404 236s
628 333 infeasible 37 - 43.09874 - 394 252s
661 345 42.37500 26 25 - 43.09874 - 396 288s
684 353 infeasible 30 - 43.09874 - 426 290s
842 370 infeasible 69 - 43.09874 - 348 306s
944 379 infeasible 86 - 43.09874 - 321 370s
1009 395 42.36667 22 25 - 43.09874 - 350 409s
* 1031 243 3 42.3333333 43.09874 1.81% 343 409s
1056 203 43.00000 19 141 42.33333 43.09874 1.81% 362 411s
1194 222 cutoff 23 42.33333 43.00000 1.57% 325 430s
1199 219 cutoff 25 42.33333 43.00000 1.57% 349 450s
1202 212 cutoff 29 42.33333 43.00000 1.57% 361 472s
1211 200 infeasible 47 42.33333 42.91851 1.38% 380 498s
1226 169 infeasible 43 42.33333 42.91471 1.37% 395 511s
Cutting planes:
Gomory: 2
Cover: 15
Implied bound: 1
Clique: 26
MIR: 17
Inf proof: 1
Zero half: 8
Explored 1426 nodes (502432 simplex iterations) in 512.68 seconds
Thread count was 4 (of 4 available processors)
Solution count 1: 42.3333
Optimal solution found (tolerance 1.00e-04)
Warning: some integer variables take values larger than the maximum
supported value (2000000000)
Best objective 4.233333333333e+01, best bound 4.233333333333e+01, gap 0.0000%
In addition to the above, I also received this warning:
"Optimal solution found (tolerance 1.00e-04)
Warning: some integer variables take values larger than the maximum
supported value (2000000000)". What does it mean? Thank you so much!
It looks like you are encountering some numerical troubles. The root relaxation required an increased Markowitz tolerance, which indicates an ill-conditioned matrix. This may lead to inconsistencies as you have observed in the two different "optimal" solutions.
The warning about too large values means that there are integer variables with solution values so large that the integer feasibility tolerance cannot be checked reliably anymore. If you have a variable with a solution value in the range of 1e+9, it probably doesn't matter anymore whether it is integer or not, so you could probably also simplify your model by making those variables continuous.
You should check for violations in the two solutions for both models (see here) to see how feasible the solutions actually are.
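A minimal gurobipy sketch of that check (the file names are placeholders; printQuality reports the maximum constraint, bound, and integrality violations of the incumbent):

import gurobipy as gp

m = gp.read("model1.lp")        # placeholder file name
m.read("model2_solution.mst")   # warm-start model-1 with model-2's solution
m.optimize()

# Report the maximum violations of the incumbent solution.
m.printQuality()

# Tightening tolerances and raising NumericFocus can expose
# numerically borderline "optima" like the ones observed here.
m.Params.IntFeasTol = 1e-9
m.Params.FeasibilityTol = 1e-9
m.Params.NumericFocus = 3
m.optimize()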

SQL JOIN with 2 aggregates returning incorrect results

I am trying to join 3 different tables to get how many home runs a player has hit in his career along with how many awards he has received. However, I'm getting incorrect results:
Peoples: PlayerId
Battings: PlayerId, HomeRuns
AwardsPlayers: PlayerId, AwardName
Current Attempt
SELECT TOP 25 Peoples.PlayerId, SUM(Battings.HomeRuns) as HomeRuns, COUNT(AwardsPlayers.PlayerId) as AwardCount
FROM Peoples
JOIN Battings ON Battings.PlayerId = Peoples.PlayerId
JOIN AwardsPlayers ON AwardsPlayers.PlayerId = Battings.PlayerId
GROUP BY Peoples.PlayerId
ORDER BY SUM(HomeRuns) desc
Result
PlayerID HomeRuns AwardCount
bondsba01 35814 1034
ruthba01 23562 726
rodrial01 21576 682
mayswi01 21120 736
willite01 20319 741
griffke02 18270 667
schmimi01 18084 594
musiast01 16150 748
pujolal01 14559 414
dimagjo01 12996 468
ripkeca01 12499 609
gehrilo01 12325 425
aaronha01 12080 368
foxxji01 11748 462
ramirma02 10545 399
benchjo01 10114 442
sosasa01 9744 304
ortizda01 9738 360
piazzmi01 9394 396
winfida01 9300 460
rodriiv01 9019 667
robinfr02 8790 330
dawsoan01 8760 420
robinbr01 8576 736
hornsro01 8127 648
I am pretty confident the problem is my second join. Do I need some sort of subquery, or should this work? Barry Bonds definitely does not have 35,814 home runs, nor does he have 1,034 awards.
If I just do a single join, I get the correct output:
SELECT TOP 25 Peoples.PlayerId, SUM(Battings.HomeRuns) as HomeRuns
FROM Peoples
JOIN Battings ON Battings.PlayerId = Peoples.PlayerId
GROUP BY Peoples.PlayerId
ORDER BY SUM(HomeRuns) desc
bondsba01 762
aaronha01 755
ruthba01 714
rodrial01 696
mayswi01 660
pujolal01 633
griffke02 630
thomeji01 612
sosasa01 609
robinfr02 586
mcgwima01 583
killeha01 573
palmera01 569
jacksre01 563
ramirma02 555
schmimi01 548
ortizda01 541
mantlmi01 536
foxxji01 534
mccovwi01 521
thomafr04 521
willite01 521
bankser01 512
matheed01 512
ottme01 511
What am I doing wrong? I'm sure it's how I'm joining my second table (AwardsPlayers).
I think you have two independent dimensions: joining both tables directly pairs every batting row with every award row for a player, so each aggregate gets multiplied by the other table's row count (e.g. Bonds: 762 home runs × 47 award rows = 35,814). The best approach is to aggregate before joining:
SELECT TOP 25 p.PlayerId, b.HomeRuns, ap.cnt
FROM Peoples p LEFT JOIN
(SELECT b.PlayerId, SUM(b.HomeRuns) as HomeRuns
FROM Battings b
GROUP BY b.PlayerId
) b
ON b.PlayerId = p.PlayerId LEFT JOIN
(SELECT ap.PlayerId, COUNT(*) as cnt
FROM AwardsPlayers ap
GROUP BY ap.PlayerId
) ap
ON ap.PlayerId = p.PlayerId
ORDER BY b.HomeRuns desc;
Result
bondsba01 762 47
aaronha01 755 16
ruthba01 714 33
rodrial01 696 31
mayswi01 660 32
pujolal01 633 23
griffke02 630 29
thomeji01 612 6
sosasa01 609 16
robinfr02 586 15
mcgwima01 583 9
killeha01 573 8
palmera01 569 8
jacksre01 563 13
ramirma02 555 19
schmimi01 548 33
ortizda01 541 18
mantlmi01 536 15
foxxji01 534 22
mccovwi01 521 10
thomafr04 521 10
willite01 521 39
bankser01 512 10
matheed01 512 4
ottme01 511 11