TVM: How to represent int8 gemm with int32 output

def matmul(M, K, N, dtype):
    A = te.placeholder((M, K), name="A", dtype=dtype)
    B = te.placeholder((K, N), name="B", dtype=dtype)
    k = te.reduce_axis((0, K), name="k")
    matmul = te.compute(
        (M, N),
        lambda i, j: te.sum(A[i, k] * B[k, j], axis=k),
        name="matmul",
        attrs={"layout_free_placeholders": [B]},  # enable automatic layout transform for tensor B
    )
    out = te.compute((M, N), lambda i, j: matmul[i, j], name="out")
    return [A, B, out]
The output type is also int8, so any result larger than int8 is truncated during the computation.
How can I make the out tensor int32?
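One possible fix (a sketch, not from the original post): cast the operands to int32 inside the reduction, so the multiply-accumulate runs in 32 bits instead of overflowing int8. TE expressions support astype for this; matmul_i32 is just an illustrative name:

def matmul_i32(M, K, N, dtype="int8"):
    A = te.placeholder((M, K), name="A", dtype=dtype)
    B = te.placeholder((K, N), name="B", dtype=dtype)
    k = te.reduce_axis((0, K), name="k")
    # Casting each operand before the multiply makes te.sum accumulate
    # in int32, so the matmul and out tensors are int32 as well.
    matmul = te.compute(
        (M, N),
        lambda i, j: te.sum(
            A[i, k].astype("int32") * B[k, j].astype("int32"), axis=k
        ),
        name="matmul",
        attrs={"layout_free_placeholders": [B]},
    )
    out = te.compute((M, N), lambda i, j: matmul[i, j], name="out")
    return [A, B, out]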

Related

Apply my function to a data frame

I have the following function to calculate implied volatility under the Black-Scholes model,
where I manually paste the necessary values for S0, K, T, r,
and market_price into the function.
I would like to apply this same function to a pandas DataFrame,
which holds the values needed to perform the calculation.
Data frame example
data = {'Name': ['BOVAE115', 'BOVAE119', 'BBDCE251', 'BBDCE246'],
        'Valor': [110.050003, 110.050003, 19.500000, 19.500000],
        'Strike Value': [15.00, 19.00, 24.67, 25.19],
        'Temp': [0.119048, 0.119048, 0.119048, 0.119048],
        'Taxa': [11.65, 11.65, 11.65, 11.65],
        'Market Price': [0.391968, 0.391968, 0.391968, 0.391968],
        'Order': ['c', 'c', 'c', 'c']}

# Create DataFrame
df = pd.DataFrame(data)
df
How do I apply the function to these values?
The function:
S0 = df['Valor']
K = df['Strike Value']
T = df['Temp']
r = df['Taxa']
market_price = df['Market Price']
flag = df['Order']
from py_vollib.black_scholes import black_scholes as bs
from py_vollib.black_scholes.greeks.analytical import vega

def implied_vol(S0, K, T, r, market_price, flag='c', tol=0.00001):
    """Calculate the implied volatility of a European option.

    S0: stock price
    K: strike price
    T: time to maturity
    r: risk-free rate
    market_price: option price in market
    """
    max_iter = 500   # max no. of iterations
    vol_old = 0.3    # initial guess
    for k in range(max_iter):
        bs_price = bs(flag, S0, K, T, r, vol_old)
        Cprime = vega(flag, S0, K, T, r, vol_old) * 100
        C = bs_price - market_price
        vol_new = vol_old - C / Cprime   # Newton-Raphson step
        new_bs_price = bs(flag, S0, K, T, r, vol_new)
        if abs(vol_old - vol_new) < tol or abs(new_bs_price - market_price) < tol:
            break
        vol_old = vol_new
    implied_vol = vol_new
    return implied_vol
S0 = 14.73
K = 15.04
T = 20/252
r = 0.1165
market_price = 0.41
print(implied_vol(S0, K, T, r, market_price)*100)
I want to store the implied vol value in a DataFrame column.
How can I apply this function to my DataFrame?
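A minimal sketch of one way to do this with DataFrame.apply, row by row (axis=1). It assumes the Taxa column holds a percentage (the scalar example uses r = 0.1165 for a rate of 11.65, so it is divided by 100 here); adjust the column mapping if your data differs:

# Map each row's columns onto the function's arguments and store the
# result, converted to percent, in a new column. Assumes Taxa is a
# percentage and Order holds the py_vollib flag ('c' or 'p').
df['Implied Vol'] = df.apply(
    lambda row: implied_vol(
        S0=row['Valor'],
        K=row['Strike Value'],
        T=row['Temp'],
        r=row['Taxa'] / 100,
        market_price=row['Market Price'],
        flag=row['Order'],
    ) * 100,
    axis=1,
)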

How to shuffle elements of Mutable list in Kotlin?

I wanted to create a MutableList of alphabet letters, shuffle it, and store the result in another MutableList.
I used the shuffle() function, but it shuffled the original list as well, which I didn't want to happen, because I will be using the original list to map it to the new shuffled one.
fun main() {
    val alphabets = ('A'..'Z').toMutableList()
    var shuffAlp = alphabets
    shuffAlp.shuffle()
    println(alphabets)
    println(shuffAlp)
}
So I had to create two mutable lists and then shuffle one of them:
val alphabets = ('A'..'Z').toMutableList()
var shuffAlp = ('A'..'Z').toMutableList()
shuffAlp.shuffle()
This might be a trivial question, but is there any other way where I do not have to create the same list twice?
shuffle() shuffles the original list in place, while shuffled() returns a new, shuffled list.
The same in-place/copy split applies to sort & sorted and sortBy & sortedBy (and reverse reverses in place, while asReversed returns a reversed view):
fun main() {
    val alphabets = ('A'..'Z').toMutableList()
    val shuffAlp = alphabets.shuffled()
    println(alphabets)
    println(shuffAlp)
}
Result:
[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z]
[U, B, A, N, H, R, O, K, X, C, W, E, Q, P, J, Z, L, Y, S, M, I, D, V, F, G, T]

Numpy dot product with 3d array

I've got two arrays:
data of shape (2466, 2498, 9), where the dimensions are (asset, date, returns).
correlation_matrix of shape (2466, 2466) (with 0's on the diagonal)
I want to get the dot product that equates to the expected returns, which is the returns of each asset multiplied by the correlation_matrix. It should give a shape the same as data.
I've tried:
data.transpose([1, 2, 0]) @ correlation_matrix
but this just hangs my PC (been going 10 minutes and counting).
I also tried:
np.einsum('ijk,lm->ijk', data, correlation_matrix)
but I'm less familiar with einsum, and this also hangs.
What am I doing wrong?
With your .transpose((1, 2, 0)) data, the correct form is:
"ijs,sk" # -> ijk
Since for a tensor A and B, we can write:
C_{ijk} = Σ_s A_{ijs} * B_{sk}
If you want to avoid transposing your data beforehand, you can just permute the indices:
"sij,sk" # -> ijk
To verify:
p, q, r = 2466, 2498, 9
a = np.random.randint(255, size=(p, q, r))
b = np.random.randint(255, size=(p, p))
c1 = a.transpose((1, 2, 0)) @ b
c2 = np.einsum("sij,sk", a, b)
>>> np.all(c1 == c2)
True
The number of multiplications needed to compute this for (p, q, r)-shaped data is p * np.prod(c1.shape) == p * (q * r * p) == p**2 * q * r. In your case, that is 136_716_549_192 multiplications. You also need approximately the same number of additions, so that gives us somewhere close to 270 billion operations. If you want more speed, you could consider using a GPU for your computations via cupy.
def with_np():
    p, q, r = 2466, 2498, 9
    a = np.random.randint(255, size=(p, q, r))
    b = np.random.randint(255, size=(p, p))
    c1 = a.transpose((1, 2, 0)) @ b
    c2 = np.einsum("sij,sk", a, b)

def with_cp():
    p, q, r = 2466, 2498, 9
    a = cp.random.randint(255, size=(p, q, r))
    b = cp.random.randint(255, size=(p, p))
    c1 = a.transpose((1, 2, 0)) @ b
    c2 = cp.einsum("sij,sk", a, b)
>>> timeit(with_np, number=1)
513.066
>>> timeit(with_cp, number=1)
0.197
That's a speedup of 2600, including memory allocation, initialization, and CPU/GPU copy times! (A more realistic benchmark would give an even larger speedup.)
There are different ways to do this product:
# as you already suggested:
data.transpose([1, 2, 0]) @ correlation_matrix
# using einsum
np.einsum('ijk,il', data, correlation_matrix)
# using tensordot to explicitly specify the axes to sum over
np.tensordot(data, correlation_matrix, axes=(0,0))
All of them should give the same result. The timing for some small matrices was more or less the same for me. So your problem is the large amount of data, not an inefficient implementation.
A = np.arange(100*120*9).reshape((100, 120, 9))
B = np.arange(100**2).reshape((100, 100))

timeit('A.transpose([1,2,0]) @ B', globals=globals(), number=100)
# 0.747475513999234
timeit("np.einsum('ijk,il', A, B)", globals=globals(), number=100)
# 0.4993825999990804
timeit('np.tensordot(A, B, axes=(0,0))', globals=globals(), number=100)
# 0.5872082839996438

Complex matrix multiplication with tensorflow-backend of Keras

Let matrix F1 have a shape of (a * h * w * m), matrix F2 a shape of (a * h * w * n), and matrix G a shape of (a * m * n).
I want to implement the following formula, which calculates each factor of G from factors of F1 and F2, using the TensorFlow backend of Keras. However, I am confused by the various backend functions, especially K.dot() and K.batch_dot().
$$ G_{k,i,j} = \sum_{s=1}^{h} \sum_{t=1}^{w} \frac{F^1_{k,s,t,i} \cdot F^2_{k,s,t,j}}{h \cdot w} $$
Is there any way to implement the above formula? Thank you in advance.
Using Tensorflow tf.einsum() (which you could wrap in a Lambda layer for Keras):
import tensorflow as tf
import numpy as np

a, h, w, m, n = 1, 2, 3, 4, 5
F1 = tf.random_uniform(shape=(a, h, w, m))
F2 = tf.random_uniform(shape=(a, h, w, n))
G = tf.einsum('ahwm,ahwn->amn', F1, F2) / (h * w)

with tf.Session() as sess:
    f1, f2, g = sess.run([F1, F2, G])

# Manually computing G to check our operation, naively reproducing your equation:
g_check = np.zeros(shape=(a, m, n))
for k in range(a):
    for i in range(m):
        for j in range(n):
            for s in range(h):
                for t in range(w):
                    g_check[k, i, j] += f1[k, s, t, i] * f2[k, s, t, j] / (h * w)

# Checking for equality:
print(np.allclose(g, g_check))
# > True
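The answer mentions wrapping the einsum in a Lambda layer for Keras; a minimal sketch of what that could look like, assuming F1 and F2 are Keras tensors inside a model (the layer variable name is mine, not from the original answer):

from keras.layers import Lambda

# Hypothetical wrapper: a Lambda layer applying the same einsum to a
# pair of tensors [F1, F2] so the operation can sit inside a Keras model.
batched_outer = Lambda(
    lambda x: tf.einsum('ahwm,ahwn->amn', x[0], x[1]) / (h * w)
)
G = batched_outer([F1, F2])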

Efficiently compute Knuth's up-arrow notation modulus

I'm already memoizing results in a dictionary. Is there anything else I can do? I suspect the for loop could be optimized. For reference, I am computing knuth_arrow(2, 3, 9, 14**8).
memo = {}

def knuth_arrow(a, n, b, m):
    """Intended to compute a ↑^n b (mod m), memoized on (a, n, b)."""
    if (a, n, b) in memo:
        return memo[(a, n, b)]
    if n == 0:
        return (a * b) % m        # a ↑^0 b is multiplication
    if n == 1:
        s = pow(a, b, m)          # a ↑^1 b is exponentiation
        memo[(a, n, b)] = s
        return s
    if n > 1:
        s = a
        for i in range(b - 1):    # fold b copies of a with the (n-1)-arrow
            s = knuth_arrow(a, n - 1, s, m)
        memo[(a, n, b)] = s
        return s
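One caveat worth a sketch (not from the original post): reducing an intermediate result mod m and then reusing it as an exponent is not valid in general, because exponents live mod φ(m), not mod m. The standard tool for power towers is the generalized Euler theorem: for e >= log2(m), a^e ≡ a^(φ(m) + (e mod φ(m))) (mod m). Since the chain m, φ(m), φ(φ(m)), ... reaches 1 in O(log m) steps, a tall tower a↑↑h mod m needs only that many recursive calls, with no long loop. A minimal sketch for the tetration case, assuming a >= 2 and a tower tall enough that every exponent exceeds log2 of its modulus (very short towers should be computed directly); phi is a simple trial-division totient written for this example:

def phi(n):
    # Euler's totient via trial-division factorization (fine for moderate n)
    result, p = n, 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            result -= result // p
        p += 1
    if n > 1:
        result -= result // n
    return result

def tetration_mod(a, h, m):
    # a↑↑h mod m via the generalized Euler theorem: recurse on the
    # totient chain and lift the exponent by phi(m) so the theorem
    # applies even when gcd(a, m) > 1.
    if m == 1:
        return 0
    if h == 0:
        return 1 % m
    if h == 1:
        return a % m
    p = phi(m)
    e = tetration_mod(a, h - 1, p) + p   # e ≡ a↑↑(h-1) (mod p), with e >= p
    return pow(a, e, m)

For higher arrows such as knuth_arrow(2, 3, 9, 14**8), the towers involved are so tall that a↑↑h mod m stabilizes once h exceeds the length of the φ-chain, so the b - 1 loop iterations collapse to a bounded amount of work.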