ax=b solving with numpy linalg gives "Last 2 dimensions of the array must be square" - numpy

I have 3 arrays of shape (157, 4), (4,) and (157,) respectively. The first (an_array) contains the quantities of the items, the second (Item matrix) contains the item names, and the third (Price matrix) contains the prices. I am trying to work out the price of the individual items, but I get an error saying "Last 2 dimensions of the array must be square" every time I run
X = np.linalg.solve(an_array,price)
I am trying to solve ax=b but unfortunately it's not working out. Any help would be very much appreciated.
**Price matrix**
array([499.25, 381. , 59.5 , 290. , 128.5 , 305.25, 336.25, 268.5 ,
395.25, 136.5 , 194.5 , 498.75, 62.25, 312.75, 332. , 171. ,
402.5 , 144.5 , 261.5 , 242.75, 381. , 371. , 355.5 , 373. ,
65.5 , 228.75, 208.75, 204.5 , 86.5 , 143. , 70.5 , 36.5 ,
82. , 302.5 , 365.5 , 158.5 , 316.5 , 508. , 86.5 , 359.75,
25.5 , 345.5 , 304.5 , 491.25, 181.5 , 343.75, 383.5 , 283.5 ,
140.25, 426. , 386. , 337.25, 415.5 , 268.25, 406. , 149.5 ,
200. , 122. , 510.25, 280. , 406.75, 191.25, 198. , 114.5 ,
211.5 , 241.75, 195.75, 276.25, 276. , 165.25, 102. , 425. ,
195. , 132.25, 86.75, 446.5 , 318. , 290.75, 286. , 232. ,
520.5 , 382.75, 94. , 482.75, 233.25, 262. , 368.25, 438.75,
433.5 , 334.5 , 360. , 422. , 191. , 292.25, 151.75, 440.25,
370. , 105.25, 122. , 455.5 , 363. , 436. , 147.5 , 548.5 ,
365.75, 185.5 , 348. , 342.5 , 509.25, 465.5 , 380.25, 361. ,
271.25, 414.25, 366.75, 145.5 , 348. , 471.25, 254.5 , 329. ,
441. , 253.25, 448.5 , 142. , 312.5 , 350. , 94. , 333. ,
418. , 194.5 , 543. , 212.5 , 66.5 , 370. , 423. , 164. ,
393.25, 299.75, 529.5 , 166.25, 228.5 , 476. , 373. , 383.25,
409. , 241. , 107.75, 194.5 , 350. , 221.75, 633.25, 444.25,
155.25, 76. , 542. , 346. , 159.75])
**Item matrix**
array(['Cereal', 'Coffee', 'Eggs', 'Pancake'], dtype='<U7')
**an_array matrix**
array([[ 7, 6, 6, 9],
[ 6, 7, 0, 10],
[ 0, 7, 1, 0],
[ 4, 0, 10, 0],
[ 0, 0, 4, 2],
[10, 7, 0, 3],
[ 9, 1, 4, 3],
[ 9, 8, 0, 2],
[10, 0, 4, 5],
[ 6, 3, 0, 0],
[ 0, 1, 9, 0],
[ 3, 9, 9, 9],
[ 2, 0, 0, 1],
[ 6, 0, 6, 3],
[ 9, 0, 3, 4],
[ 8, 2, 0, 0],
[ 9, 0, 0, 10],
[ 5, 0, 0, 2],
[ 8, 7, 3, 0],
[ 0, 1, 6, 5],
[ 7, 0, 3, 8],
[ 0, 5, 10, 6],
[ 7, 1, 10, 0],
[ 4, 7, 10, 2],
[ 0, 0, 1, 2],
[ 2, 6, 0, 7],
[ 6, 4, 0, 3],
[ 8, 0, 0, 2],
[ 0, 0, 2, 2],
[ 3, 7, 0, 2],
[ 0, 9, 1, 0],
[ 1, 3, 0, 0],
[ 3, 4, 0, 0],
[ 4, 0, 0, 10],
[ 4, 9, 7, 4],
[ 6, 7, 0, 0],
[ 6, 9, 7, 0],
[ 6, 0, 10, 8],
[ 0, 0, 2, 2],
[ 6, 0, 4, 7],
[ 1, 1, 0, 0],
[ 3, 9, 7, 4],
[ 0, 1, 10, 4],
[ 6, 1, 10, 7],
[ 0, 2, 6, 2],
[ 4, 1, 7, 5],
[ 0, 3, 9, 8],
[ 3, 2, 8, 2],
[ 0, 10, 3, 1],
[ 4, 0, 8, 8],
[ 2, 0, 8, 8],
[ 7, 8, 2, 5],
[ 7, 2, 2, 10],
[ 0, 9, 3, 7],
[ 4, 4, 6, 8],
[ 5, 9, 0, 0],
[ 0, 2, 9, 0],
[ 4, 0, 2, 0],
[10, 9, 5, 7],
[ 4, 4, 0, 8],
[ 9, 10, 5, 3],
[ 4, 0, 0, 5],
[ 1, 0, 0, 8],
[ 1, 1, 0, 4],
[ 4, 1, 6, 0],
[ 0, 8, 2, 7],
[ 1, 5, 6, 1],
[ 4, 4, 3, 5],
[ 9, 6, 3, 0],
[ 2, 3, 2, 3],
[ 4, 4, 0, 0],
[ 8, 10, 10, 0],
[ 6, 6, 2, 0],
[ 0, 0, 1, 5],
[ 1, 0, 0, 3],
[ 7, 0, 4, 10],
[ 4, 8, 5, 4],
[ 9, 8, 0, 3],
[ 4, 6, 4, 4],
[10, 2, 1, 0],
[10, 3, 6, 8],
[ 6, 8, 3, 7],
[ 0, 9, 0, 2],
[ 8, 7, 4, 9],
[ 0, 6, 0, 9],
[ 0, 0, 4, 8],
[ 0, 0, 8, 9],
[ 8, 8, 8, 3],
[ 3, 5, 8, 8],
[ 7, 3, 0, 8],
[ 9, 6, 7, 0],
[ 8, 0, 4, 8],
[ 9, 2, 0, 0],
[ 6, 3, 0, 7],
[ 0, 4, 3, 3],
[10, 1, 8, 3],
[ 5, 10, 6, 4],
[ 0, 7, 0, 3],
[ 4, 0, 2, 0],
[10, 6, 0, 10],
[ 4, 0, 5, 8],
[10, 0, 7, 4],
[ 1, 7, 0, 4],
[10, 0, 6, 10],
[ 8, 10, 4, 3],
[ 9, 1, 0, 0],
[ 9, 0, 8, 0],
[ 6, 0, 0, 10],
[ 9, 1, 8, 7],
[ 1, 10, 8, 10],
[ 0, 6, 7, 9],
[10, 5, 0, 6],
[ 0, 10, 5, 5],
[ 8, 6, 1, 9],
[ 0, 4, 9, 7],
[ 4, 0, 1, 2],
[ 3, 9, 5, 6],
[10, 10, 5, 5],
[ 5, 0, 1, 6],
[ 9, 8, 5, 0],
[ 8, 3, 2, 10],
[ 0, 2, 2, 9],
[ 5, 9, 10, 4],
[ 6, 4, 0, 0],
[ 8, 1, 7, 0],
[ 6, 7, 7, 2],
[ 0, 9, 0, 2],
[10, 8, 0, 4],
[10, 1, 8, 2],
[ 0, 3, 0, 8],
[10, 8, 10, 4],
[ 0, 0, 8, 2],
[ 0, 4, 0, 2],
[ 8, 0, 10, 0],
[ 7, 0, 5, 8],
[ 6, 8, 0, 0],
[ 8, 6, 0, 9],
[10, 6, 0, 3],
[ 9, 4, 5, 10],
[ 0, 10, 0, 5],
[ 6, 4, 2, 2],
[ 9, 10, 3, 8],
[ 8, 3, 3, 6],
[ 6, 0, 3, 9],
[ 3, 1, 10, 6],
[ 0, 0, 3, 8],
[ 4, 1, 0, 1],
[ 5, 1, 0, 4],
[ 7, 0, 10, 0],
[ 5, 10, 0, 3],
[10, 8, 9, 9],
[10, 6, 9, 1],
[ 0, 8, 0, 5],
[ 0, 10, 1, 0],
[ 8, 7, 10, 6],
[ 0, 0, 8, 8],
[ 0, 5, 1, 5]])

You have 4 unknowns that appear in 157 equations. np.linalg.solve only handles square systems (as many equations as unknowns), which is why it raises the error. An overdetermined system like this one generally has no exact solution, so it is solved in the least-squares sense with np.linalg.lstsq:
X = np.linalg.lstsq(an_array, price)[0]
print(X)
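As a minimal sketch of the same idea on a smaller system: the quantities and prices below are made up for illustration and are not the asker's data. Passing rcond=None is optional and simply selects NumPy's current default cutoff explicitly.

```python
import numpy as np

# Hypothetical data: 6 orders (equations) of 3 items (unknowns).
quantities = np.array([[1, 0, 0],
                       [0, 1, 0],
                       [0, 0, 1],
                       [1, 1, 0],
                       [0, 1, 1],
                       [2, 0, 3]], dtype=float)
true_prices = np.array([20.0, 5.5, 21.0])
totals = quantities @ true_prices  # one total per order

# lstsq accepts a non-square coefficient matrix and returns four values;
# the first one is the least-squares solution.
solution, residuals, rank, singular_values = np.linalg.lstsq(
    quantities, totals, rcond=None
)
print(solution)  # approximately the unit prices used to build `totals`
print(rank)      # number of linearly independent columns found
```

Because the totals here were built without noise, the least-squares solution reproduces the unit prices up to floating-point error.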

Using lstsq as recommended by @Ahmed:
In [88]: X = np.linalg.lstsq(quant, price)[0]
C:\Users\paul\AppData\Local\Temp\ipykernel_6428\995094510.py:1: FutureWarning: `rcond` parameter will change to the default of machine precision times ``max(M, N)`` where M and N are the input matrix dimensions.
To use the future default and silence this warning we advise to pass `rcond=None`, to keep using the old, explicitly pass `rcond=-1`.
X = np.linalg.lstsq(quant, price)[0]
In [90]: X
Out[90]: array([20. , 5.5 , 21. , 22.25])
Multiplying the quantities by these prices and summing gives the same numbers as your price array:
In [91]: quant@X
Out[91]:
array([499.25, 381. , 59.5 , 290. , 128.5 , 305.25, 336.25, 268.5 ,
395.25, 136.5 , 194.5 , 498.75, 62.25, 312.75, 332. , 171. ,
... , 159.75])
So there's no noise in these prices - the values are exact. That means you could use solve with any 4 rows of price and quant, as long as those 4 rows of quant are linearly independent. That addresses the "must be square" error, since quant[:4,:] has (4,4) shape.
In [92]: np.linalg.solve(quant[:4,:], price[:4])
Out[92]: array([20. , 5.5 , 21. , 22.25])
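A sketch of how one might guard that approach against an unlucky choice of rows. The helper name solve_from_rows is hypothetical, and the small quant array below copies only the first five rows of the question's data.

```python
import numpy as np

def solve_from_rows(quant, totals, rows):
    """Solve for unit prices using only the selected rows (hypothetical helper)."""
    subset = quant[rows, :]
    # solve needs a square, non-singular matrix, so check the rank first.
    if np.linalg.matrix_rank(subset) < subset.shape[1]:
        raise ValueError("selected rows are not linearly independent")
    return np.linalg.solve(subset, totals[rows])

# First five rows of the quantities from the question.
quant = np.array([[7, 6, 6, 9],
                  [6, 7, 0, 10],
                  [0, 7, 1, 0],
                  [4, 0, 10, 0],
                  [0, 0, 4, 2]])
unit_prices = np.array([20.0, 5.5, 21.0, 22.25])
totals = quant @ unit_prices

x = solve_from_rows(quant, totals, [0, 1, 2, 3])
print(x)
```

This only works because the data are exact; with noisy totals, different row subsets would give different answers and lstsq over all rows is the safer choice.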

Related

Numpy: Fast approach to extract rows from an array using an 2D index array

I have 2 arrays a and b:
N,D,V,W = 2,3,4,5
a = np.random.randint(0,V,N*D).reshape(N,D)
a
array([[2, 3, 3],
[2, 0, 3]])
b = np.random.randint(0,10,V*W).reshape(V,W)
b
array([[0, 1, 0, 5, 5],
[0, 3, 6, 8, 7],
[8, 8, 9, 0, 9],
[4, 6, 3, 3, 1]])
What I need to do is to replace every element of array a with a row from array b using the array a element value as the row index of array b.
At the moment I'm doing it this way which works fine:
b[a.ravel(),:].reshape(*a.shape,-1)
array([[[8, 8, 9, 0, 9],
[4, 6, 3, 3, 1],
[4, 6, 3, 3, 1]],
[[8, 8, 9, 0, 9],
[0, 1, 0, 5, 5],
[4, 6, 3, 3, 1]]])
However it seems this approach is a bit slow.
I tested it with:
N,D,V,W = 20000,64,100,256
and it took an average of 674ms on my laptop (8 cores, 16 GB RAM).
Can someone please recommend a faster yet still simple approach?
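One simplification worth trying, offered as a sketch rather than a measured answer: integer-array indexing already broadcasts over the shape of the index array, so b[a] (or np.take(b, a, axis=0)) gives the same result without the ravel/reshape round trip. Whether it is actually faster depends on the NumPy version and hardware, so it should be timed on the real sizes.

```python
import numpy as np

N, D, V, W = 2, 3, 4, 5
rng = np.random.default_rng(0)          # seeded only to make the sketch repeatable
a = rng.integers(0, V, size=(N, D))     # row indices into b
b = rng.integers(0, 10, size=(V, W))

original = b[a.ravel(), :].reshape(*a.shape, -1)  # approach from the question
direct = b[a]                                     # index with the 2D array directly
taken = np.take(b, a, axis=0)                     # equivalent call via np.take

print(direct.shape)  # a.shape followed by the row length of b
```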

Keen-dataviz: Uncaught Requested parser does not exist

I keep getting the following error: "Uncaught Requested parser does not exist" from the devtool console
and no chart is displayed on my dashboard.
JS Code:
var chart = new Keen.Dataviz()
.el('#chart-01')
.height(280)
.title('Registered CSRs')
.type('bar')
.prepare();
// Fetch data from my server's API
var json = $.ajax({
url: "/api/v1.0/registered_csrs/nwg",
dataType: "json",
success: function (jsonData) {
chart
.data(jsonData)
.render();
}
});
Here is the format of jsonData
$ curl -GET http://localhost:4000/api/v1.0/registered_csrs/nwg
[["Date", "Total", "Emergency", "High", "Medium", "Low"],
["2016-02-01", 1, 0, 0, 1, 0],
["2016-03-01", 6, 0, 0, 6, 0],
["2016-04-01", 11, 0, 1, 7, 3],
["2016-05-01", 19, 0, 1, 16, 2],
["2016-06-01", 27, 0, 12, 13, 2],
["2016-07-01", 27, 3, 12, 12, 0],
["2016-08-01", 25, 3, 8, 11, 3],
["2016-09-01", 21, 4, 10, 5, 2],
["2016-10-01", 19, 3, 4, 11, 1],
["2016-11-01", 29, 4, 12, 12, 1],
["2016-12-01", 26, 2, 9, 14, 1],
["2017-01-01", 16, 1, 3, 11, 1],
["2017-02-01", 22, 2, 8, 11, 1],
["2017-03-01", 28, 2, 10, 14, 2],
["2017-04-01", 15, 2, 6, 5, 2],
["2017-05-01", 28, 2, 7, 18, 1],
["2017-06-01", 22, 1, 11, 8, 2],
["2017-07-01", 10, 1, 4, 5, 0]]
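Take a look at the related question "Keen.io Dataviz to draw graph but keep getting error 'Uncaught Requested parser does not exist'" and at the documentation:
https://keen.io/docs/visualize/visualize-your-own-data/
Wrapping the array in an object under a result key, as in the code below, should work fine in your app: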
chart
.data({result: jsonData})
.render();

Splitting a number and assigning to elements in a row in a numpy array

How can I place a list of numbers into a 2D numpy array, where the second dimension of the array is equal to the number of digits of the largest number in that list? I also want the elements that don't belong to the original number to be zero in each row of the returned array.
Example:
From the list a = range(0,1001), how to get the numpy array of the below form:
[[0,0,0,0],
[0,0,0,1],
[0,0,0,2],
...
[0,9,9,8]
[0,9,9,9],
[1,0,0,0]]
Please note how each number is placed at the end of its row in a np.zeros((1001,4)) array.
NB: A pythonic, vectorized implementation is expected
Broadcasting again!
def split_digits(a):
    N = int(np.log10(np.max(a))+1) # No. of digits
    r = 10**np.arange(N,-1,-1) # 10-powered range array
    return (np.asarray(a)[:,None]%r[:-1])//r[1:]
Sample runs -
In [224]: a = range(0,1001)
In [225]: split_digits(a)
Out[225]:
array([[0, 0, 0, 0],
[0, 0, 0, 1],
[0, 0, 0, 2],
...,
[0, 9, 9, 8],
[0, 9, 9, 9],
[1, 0, 0, 0]])
In [229]: a = np.random.randint(0,1000000,(7))
In [230]: a
Out[230]: array([431921, 871855, 636144, 541186, 410562, 89356, 476258])
In [231]: split_digits(a)
Out[231]:
array([[4, 3, 1, 9, 2, 1],
[8, 7, 1, 8, 5, 5],
[6, 3, 6, 1, 4, 4],
[5, 4, 1, 1, 8, 6],
[4, 1, 0, 5, 6, 2],
[0, 8, 9, 3, 5, 6],
[4, 7, 6, 2, 5, 8]])
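To make the broadcasting step easier to follow, here is the same computation unrolled on three small numbers. The inputs and intermediate variable names are chosen for illustration only.

```python
import numpy as np

a = np.array([507, 42, 9])
n_digits = 3                               # digits in the largest number

r = 10 ** np.arange(n_digits, -1, -1)      # [1000, 100, 10, 1]

# Column j keeps only the part of each number below 10**(n_digits - j).
remainders = a[:, None] % r[:-1]           # [[507, 7, 7], [42, 42, 2], [9, 9, 9]]

# Integer division by the next lower power of ten isolates a single digit.
digits = remainders // r[1:]               # [[5, 0, 7], [0, 4, 2], [0, 0, 9]]
print(digits)
```

The a[:, None] turns the numbers into a column, so the modulo and the division are each applied against every power of ten in one vectorized step.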
Another concept using pandas str
def pir(a):
    z = int(np.log10(np.max(a)))  # number of digits minus one
    s = pd.Series(np.asarray(a).astype(str))
    zfilled = s.str.zfill(z + 1).sum()  # concatenate the zero-padded strings
    a_ = np.array(list(zfilled)).reshape(-1, z + 1)
    return a_.astype(int)
Using @Divakar's random array
a = np.random.randint(0,1000000,(7))
array([ 57190, 29950, 392317, 592062, 460333, 639794, 983647])
pir(a)
array([[0, 5, 7, 1, 9, 0],
[0, 2, 9, 9, 5, 0],
[3, 9, 2, 3, 1, 7],
[5, 9, 2, 0, 6, 2],
[4, 6, 0, 3, 3, 3],
[6, 3, 9, 7, 9, 4],
[9, 8, 3, 6, 4, 7]])

Convert string to integer pandas dataframe index

I have a pandas dataframe with a multiindex. Unfortunately one of the indices gives years as a string
e.g. '2010', '2011'
how do I convert these to integers?
More concretely
MultiIndex(levels=[[u'2010', u'2011'], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]],
labels=[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, , ...]], names=[u'Year', u'Month'])
.
df_cbs_prelim_total.index.set_levels(df_cbs_prelim_total.index.get_level_values(0).astype('int'))
seems to do it, but not inplace. Any proper way of changing them?
Cheers,
Mike
It will probably be cleaner to do this before you assign it as the index (as @EdChum points out), but when you already have it as the index, you can indeed use set_levels to alter the labels of one level of your multi-index. This is a bit cleaner than your code (you can use index.levels[..]):
In [165]: idx = pd.MultiIndex.from_product([[1,2,3], ['2011','2012','2013']])
In [166]: idx
Out[166]:
MultiIndex(levels=[[1, 2, 3], [u'2011', u'2012', u'2013']],
labels=[[0, 0, 0, 1, 1, 1, 2, 2, 2], [0, 1, 2, 0, 1, 2, 0, 1, 2]])
In [167]: idx.levels[1]
Out[167]: Index([u'2011', u'2012', u'2013'], dtype='object')
In [168]: idx = idx.set_levels(idx.levels[1].astype(int), level=1)
In [169]: idx
Out[169]:
MultiIndex(levels=[[1, 2, 3], [2011, 2012, 2013]],
labels=[[0, 0, 0, 1, 1, 1, 2, 2, 2], [0, 1, 2, 0, 1, 2, 0, 1, 2]])
You have to reassign it to save the changes (as is done above, in your case this would be df_cbs_prelim_total.index = df_cbs_prelim_total.index.set_levels(...))
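Applied to a DataFrame, the reassignment could look like the sketch below. The frame and its value column are made up for illustration; only the Year and Month level names come from the question.

```python
import pandas as pd

# Hypothetical frame with string years in the first index level.
index = pd.MultiIndex.from_product(
    [['2010', '2011'], [0, 1, 2]], names=['Year', 'Month']
)
df = pd.DataFrame({'value': range(len(index))}, index=index)

# set_levels returns a new MultiIndex, so assign it back to df.index.
df.index = df.index.set_levels(df.index.levels[0].astype(int), level=0)

print(df.index.levels[0])
```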

Select elements of a numpy array based on the elements of a second array

Consider a numpy array A of shape (7,6)
A = array([[0, 1, 2, 3, 5, 8],
[4, 100, 6, 7, 8, 7],
[8, 9, 10, 11, 5, 4],
[12, 13, 14, 15, 1, 2],
[1, 3, 5, 6, 4, 8],
[12, 23, 12, 24, 4, 3],
[1, 3, 5, 7, 89, 0]])
together with a second numpy array r of the same shape, which contains the radius of each element of A measured from a central point at index (3,2), where r is 0:
r = array([[3, 3, 3, 3, 3, 4],
[2, 2, 2, 2, 2, 3],
[2, 1, 1, 1, 2, 3],
[2, 1, 0, 1, 2, 3],
[2, 1, 1, 1, 2, 3],
[2, 2, 2, 2, 2, 3],
[3, 3, 3, 3, 3, 4]])
I would like to pick up all the elements of A which are located where r is 1, i.e. [9,10,11,15,4,6,5,13], all the elements of A located where r is 2, and so on. Is there some numpy function to do that?
Thank you
You can select a section of A by doing something like A[r == 1]. To get all the sections as a list you could do [A[r == i] for i in range(r.max() + 1)]. This will work, but may be inefficient depending on how large the values in r get, because you need to compute r == i for every i.
You could also use this trick: first sort A based on r, then simply split the sorted A array at the right places. That looks something like this:
r_flat = r.ravel()
order = r_flat.argsort()
A_sorted = A.ravel()[order]
r_sorted = r_flat[order]
edges = r_sorted.searchsorted(np.arange(r_sorted[-1] + 1), 'right')
sections = []
start = 0
for end in edges:
    sections.append(A_sorted[start:end])
    start = end
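The same sort-then-split idea can be written without the explicit loop by using np.split. This is a sketch that uses the arrays from the question and assumes r holds non-negative integers; the stable sort is my own choice so that each section keeps the row-major order of A[r == i].

```python
import numpy as np

A = np.array([[0, 1, 2, 3, 5, 8],
              [4, 100, 6, 7, 8, 7],
              [8, 9, 10, 11, 5, 4],
              [12, 13, 14, 15, 1, 2],
              [1, 3, 5, 6, 4, 8],
              [12, 23, 12, 24, 4, 3],
              [1, 3, 5, 7, 89, 0]])
r = np.array([[3, 3, 3, 3, 3, 4],
              [2, 2, 2, 2, 2, 3],
              [2, 1, 1, 1, 2, 3],
              [2, 1, 0, 1, 2, 3],
              [2, 1, 1, 1, 2, 3],
              [2, 2, 2, 2, 2, 3],
              [3, 3, 3, 3, 3, 4]])

r_flat = r.ravel()
order = r_flat.argsort(kind='stable')   # stable keeps row-major order within a radius
A_sorted = A.ravel()[order]
r_sorted = r_flat[order]

# Position where each radius 1..max starts in the sorted array.
starts = r_sorted.searchsorted(np.arange(1, r_sorted[-1] + 1), 'left')
sections = np.split(A_sorted, starts)   # sections[i] holds the elements with r == i

print(sections[1])
```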
I get a different answer to the one you were expecting (3 not 4 from the 4th row) and the order is slightly different (strictly row then column), but:
>>> A
array([[ 0, 1, 2, 3, 5, 8],
[ 4, 100, 6, 7, 8, 7],
[ 8, 9, 10, 11, 5, 4],
[ 12, 13, 14, 15, 1, 2],
[ 1, 3, 5, 6, 4, 8],
[ 12, 23, 12, 24, 4, 3],
[ 1, 3, 5, 7, 89, 0]])
>>> r
array([[3, 3, 3, 3, 3, 4],
[2, 2, 2, 2, 2, 3],
[2, 1, 1, 1, 2, 3],
[2, 1, 0, 1, 2, 3],
[2, 1, 1, 1, 2, 3],
[2, 2, 2, 2, 2, 3],
[3, 3, 3, 3, 3, 4]])
>>> A[r==1]
array([ 9, 10, 11, 13, 15, 3, 5, 6])
Alternatively, you can get column then row ordering by transposing both arrays:
>>> A.T[r.T==1]
array([ 9, 13, 3, 10, 5, 11, 15, 6])