How to create this non-regular "identity matrix" with numpy - numpy

How to create the following matrix, with an input parameter n?
n=2:
[[1, 0, 0, 0]
[0, 0, 0, 1]]
n=3:
[[1, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 1, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 1]]
n=4:
[[1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]]

You can create an all-zero array and set the desired entries to 1:
a = np.zeros((n,n*n), dtype=int)
a[np.arange(n),(n+1)*np.arange(n)] = 1
Another way is to create the larger n*n by n*n identity matrix and take every (n+1)-th row from it:
a = np.eye(n*n, dtype=int)[::n+1]
output for n=4:
[[1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1]]
for n=3:
[[1 0 0 0 0 0 0 0 0]
[0 0 0 0 1 0 0 0 0]
[0 0 0 0 0 0 0 0 1]]
And n=2:
[[1 0 0 0]
[0 0 0 1]]

One liner:
np.bincount(np.arange(0,n*n*n,n*n+n+1)).reshape(n,n*n)
# array([[1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]])
With preallocation:
out = np.zeros((n,n*n),int)
out.ravel()[::n*n+n+1] = 1
or
out = np.zeros((n,n*n),int)
np.einsum("iii->i",out.reshape(n,n,n))[...] = 1
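The constant step n*n+n+1 used above comes from flattening: row i starts at flat offset i*n*n, and its 1 sits in column i*(n+1), so the flat index is i*(n*n+n+1). Here is a minimal sketch that wraps the preallocation variant in a function and checks it against the np.eye slicing approach; the function name stretched_identity is just an illustrative choice, not something from the answers.

```python
import numpy as np


def stretched_identity(n):
    """Return an (n, n*n) int array with a 1 at (i, i*(n+1)) for each row i."""
    out = np.zeros((n, n * n), dtype=int)
    # Row i starts at flat offset i*n*n; adding the column i*(n+1)
    # gives i*(n*n + n + 1), hence the constant step in the flat view.
    out.ravel()[:: n * n + n + 1] = 1
    return out


# Cross-check against the identity-slicing approach for a few sizes.
for n in (2, 3, 4):
    expected = np.eye(n * n, dtype=int)[:: n + 1]
    assert np.array_equal(stretched_identity(n), expected)
```

Writing through out.ravel() relies on the freshly allocated array being contiguous, so ravel() returns a view rather than a copy.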

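You can try this:
import math
import numpy as np
a = np.zeros(n**3, dtype=int)
# the step works out to n*n + n + 1 for n >= 2 (n = 1 would divide by zero)
a[range(0, n**3, math.ceil(n**3 / (n - 1)) - 1)] = 1
a = a.reshape(n, n**2)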

Related

Change every n-th element of a row in a 2d numpy array depending on the row number

I have a 2d array:
H = 12
a = np.ones([H, H])
print(a.astype(int))
[[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 1 1 1 1]]
The goal is, for every row r to substitute every r+1-th (starting with 0th) element of that row with 0.
Namely, for the 0th row substitute every 'first' (i.e. all of them) element with 0. For the 1st row substitute every 2nd element with 0. And so on.
It can trivially be done in a loop (the printed array is the desired output):
for i in np.arange(H):
    a[i, ::i+1] = 0
print(a.astype(int))
[[0 0 0 0 0 0 0 0 0 0 0 0]
[0 1 0 1 0 1 0 1 0 1 0 1]
[0 1 1 0 1 1 0 1 1 0 1 1]
[0 1 1 1 0 1 1 1 0 1 1 1]
[0 1 1 1 1 0 1 1 1 1 0 1]
[0 1 1 1 1 1 0 1 1 1 1 1]
[0 1 1 1 1 1 1 0 1 1 1 1]
[0 1 1 1 1 1 1 1 0 1 1 1]
[0 1 1 1 1 1 1 1 1 0 1 1]
[0 1 1 1 1 1 1 1 1 1 0 1]
[0 1 1 1 1 1 1 1 1 1 1 0]
[0 1 1 1 1 1 1 1 1 1 1 1]]
Can I make use of the vectorisation power of numpy here and avoid looping? Or is it not possible?
You can build the column indices with np.arange and broadcast a modulo against (row index + 1):
import numpy as np
H = 12
a = np.arange(H)
((a % (a+1)[:, None]) != 0).astype('int')
Output
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
[0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1],
[0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1],
[0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1],
[0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1],
[0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1],
[0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1],
[0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1],
[0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1],
[0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
[0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])
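To see that the broadcast expression really matches the loop from the question, here is a small sketch that puts both side by side. The function names are made up for illustration; the logic is the same as in the question and the answer.

```python
import numpy as np


def zero_every_kth_loop(H):
    """Loop version from the question: zero every (r+1)-th element of row r."""
    a = np.ones((H, H), dtype=int)
    for i in range(H):
        a[i, :: i + 1] = 0
    return a


def zero_every_kth_broadcast(H):
    """Broadcast version: entry (r, c) is 0 exactly when c % (r + 1) == 0."""
    cols = np.arange(H)         # shape (H,), the column indices
    steps = cols[:, None] + 1   # shape (H, 1), the step r + 1 for each row
    return (cols % steps != 0).astype(int)


assert np.array_equal(zero_every_kth_loop(12), zero_every_kth_broadcast(12))
```

The (H,) and (H, 1) shapes broadcast to (H, H), so no explicit loop over rows is needed.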

How to collect more than 200 tweets for multiple twitter handles / users? [duplicate]

I am downloading more tweets than Twitter's per-request cap of 200 by using a loop; however, when I try to build the dataframe from the appended list, it comes back empty.
My function looks like:
IN:
import pandas as pd
import numpy as np
import tweepy
from datetime import timedelta
def get_tweets(handle):
    batch_count_for_tweet_downloads = 200
    try:
        alltweets = []
        tweets = api_twitter.user_timeline(screen_name=handle,
                                           count=batch_count_for_tweet_downloads,
                                           exclude_replies=True,
                                           include_rts=False,
                                           lang="en",
                                           tweet_mode="extended")
        # ---GET MORE THAN 200 TWEETS
        alltweets.extend(tweets)
        oldest = alltweets[-1].id - 1
        oldest_datetime = pd.to_datetime(str(pd.to_datetime(oldest))[:-10]).strftime("%Y-%m-%d %H:%M:%S")
        print(f"Getting Tweets For " + handle + ", After: " + oldest_datetime)
        while len(tweets) > 0:
            tweets = api_twitter.user_timeline(screen_name=handle, count=batch_count_for_tweet_downloads, max_id=oldest)
            alltweets.extend(tweets)
            oldest = alltweets[-1].id - 1
            print("Count: " + f"...{len(alltweets)} " + handle + " Tweets Downloaded")
        #---
        df = pd.DataFrame(data=[tweets.user.screen_name for tweets in alltweets], columns=['Handle'])
        df['Tweets'] = np.array([tweets.full_text for tweets in alltweets])
        df['Date'] = np.array([tweets.created_at - timedelta(hours=4) for tweets in alltweets])
        df['Len'] = np.array([len(tweets.full_text) for tweets in alltweets])
        df['Like_count'] = np.array([tweets.favorite_count for tweets in alltweets])
        df['RT_count'] = np.array([tweets.retweet_count for tweets in alltweets])
        total_tweets.extend(alltweets)
        print(handle + " Total Tweets Extracted: {}".format(len(alltweets)))
    except:
        pass
    return df
As you can see I need some help merging the loop into the function.
What is the best way of doing this?
Thank you for your help in advance.
EDIT 1: (What my code looks like now)
IN:
import tweepy
import pandas as pd
import numpy as np
from datetime import timedelta
handles = ['#MrML16419203', '#d00tn00t']
consumerKey = 'x'
consumerSecret = 'x'
accessToken = 'x'
accessTokenSecret = 'x'
authenticate = tweepy.OAuthHandler(consumerKey, consumerSecret)
authenticate.set_access_token(accessToken, accessTokenSecret)
api_twitter = tweepy.API(authenticate, wait_on_rate_limit=True)
total_tweets = []
def get_tweets(handle):
    batch_count_for_tweet_downloads = 200
    try:
        alltweets = []
        tweets = api_twitter.user_timeline(screen_name=handle,
                                           count=batch_count_for_tweet_downloads,
                                           exclude_replies=True,
                                           include_rts=False,
                                           lang="en",
                                           tweet_mode="extended")
        alltweets.extend(tweets)
        oldest = alltweets[-1].id - 1
        oldest_datetime = pd.to_datetime(str(pd.to_datetime(oldest))[:-10]).strftime("%Y-%m-%d %H:%M:%S")
        print(f"Getting Tweets For " + handle + ", After: " + oldest_datetime)
        while len(tweets) > 0:
            tweets = api_twitter.user_timeline(screen_name=handle, count=batch_count_for_tweet_downloads, max_id=oldest)
            alltweets.extend(tweets)
            if len(alltweets) > 0:
                oldest = alltweets[-1].id - 1
            else:
                pass
            print("Count: " + f"...{len(alltweets)} " + handle + " Tweets Downloaded")
        print('---Total Downloaded: ' + str(len(alltweets)) + ' for ' + handle + '---')
        df = pd.DataFrame(data=[tweets.user.screen_name for tweets in alltweets], columns=['Handle'])
        df['Tweets'] = np.array([tweets.full_text for tweets in alltweets])
        df['Date'] = np.array([tweets.created_at - timedelta(hours=4) for tweets in alltweets])
        df['Len'] = np.array([len(tweets.full_text) for tweets in alltweets])
        df['Like_count'] = np.array([tweets.favorite_count for tweets in alltweets])
        df['RT_count'] = np.array([tweets.retweet_count for tweets in alltweets])
        print([tweets.favorite_count for tweets in alltweets])
        print(np.array([tweets.favorite_count for tweets in alltweets]))
        total_tweets.extend(alltweets)
        print("----------Total Tweets Extracted: {}".format(df.shape[0]) + "----------")
    except:
        pass
    return df
df = pd.DataFrame()
for handle in handles:
    df_new = get_tweets(handle)
    df = pd.concat((df, df_new))
print(df)
OUT:
Getting Tweets For #MrML16419203, After: 2011-03-19 07:03:53
Count: ...136 #MrML16419203 Tweets Downloaded
---Total Downloaded: 136 for #MrML16419203---
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
----------Total Tweets Extracted: 136----------
Getting Tweets For #d00tn00t, After: 2009-11-27 19:18:58
Count: ...338 #d00tn00t Tweets Downloaded
Count: ...530 #d00tn00t Tweets Downloaded
Count: ...546 #d00tn00t Tweets Downloaded
Count: ...546 #d00tn00t Tweets Downloaded
---Total Downloaded: 546 for #d00tn00t---
Handle Tweets Date Len Like_count RT_count
0 MrML16419203 132716 2020-09-02 02:18:28 6.0 0.0 0.0
1 MrML16419203 432881 2020-09-02 02:04:23 6.0 0.0 0.0
2 MrML16419203 973625 2020-09-02 02:04:09 6.0 0.0 0.0
3 MrML16419203 1234567 2020-09-02 01:55:10 7.0 0.0 0.0
4 MrML16419203 225865 2020-09-02 01:27:11 6.0 0.0 0.0
.. ... ... ... ... ... ...
541 d00tn00t NaN NaT NaN NaN NaN
542 d00tn00t NaN NaT NaN NaN NaN
543 d00tn00t NaN NaT NaN NaN NaN
544 d00tn00t NaN NaT NaN NaN NaN
545 d00tn00t NaN NaT NaN NaN NaN
[682 rows x 6 columns]
As you can see, the dataframe gets populated for handles that have fewer than 200 tweets, but not for handles that have more than 200 tweets.
For anyone that stumbles across this I got it to work:
def get_tweets(screen_name):
    batch_count_for_tweet_downloads = 200
    # Returned as-is if the API call fails, so the caller always gets a DataFrame
    df_tweet_function = pd.DataFrame()
    try:
        alltweets = []
        tweets = api_twitter.user_timeline(screen_name=screen_name,
                                           count=batch_count_for_tweet_downloads,
                                           exclude_replies=True,
                                           include_rts=False,
                                           lang="en")
        alltweets.extend(tweets)
        oldest = alltweets[-1].id - 1
        oldest_datetime = pd.to_datetime(str(pd.to_datetime(oldest))[:-10]).strftime("%Y-%m-%d %H:%M:%S")
        print("Getting Tweets For " + screen_name + ", After: " + oldest_datetime)
        # Keep requesting older pages until an empty page comes back
        while len(tweets) > 0:
            tweets = api_twitter.user_timeline(screen_name=screen_name, count=batch_count_for_tweet_downloads,
                                               max_id=oldest)
            alltweets.extend(tweets)
            if len(alltweets) > 0:
                oldest = alltweets[-1].id - 1
            print("Count: " + f"...{len(alltweets)} " + screen_name + " Tweets Downloaded")
        outtweets = [
            [tweet.user.screen_name, tweet.text, tweet.created_at, len(tweet.text),
             tweet.favorite_count, tweet.retweet_count] for tweet in alltweets]
        df_tweet_function = pd.DataFrame(outtweets,
                                         columns=['Handle', 'Tweets', 'Date', 'Len', 'Like_count', 'RT_count'])
        print('----------Total Downloaded: ' + str(len(alltweets)) + ' for ' + screen_name + '----------')
    except tweepy.error.TweepError:  # tweepy 3.x name; tweepy 4+ calls this tweepy.TweepyException
        pass
    return df_tweet_function
df = pd.DataFrame()
if __name__ == '__main__':
    for handle in handles:
        # One call per handle; pd.concat replaces DataFrame.append, which was removed in pandas 2.0
        df = pd.concat((df, get_tweets(handle)))
    print("---------------TOTAL TWEETS EXTRACTED: {}".format(df.shape[0]) + "---------------")
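For reference, the max_id paging pattern can be looked at on its own, without Tweepy or credentials. The sketch below is only an illustration: fetch_all, make_fake_timeline and the fetch_page callable are hypothetical stand-ins for api_twitter.user_timeline, and the fake timeline is plain dicts with an "id" key rather than real tweet objects.

```python
def fetch_all(fetch_page, batch_size=200):
    """Collect items newest-first by repeatedly asking for ids <= max_id.

    fetch_page(count, max_id) is a hypothetical callable standing in for
    api_twitter.user_timeline; it returns a list of dicts with an "id" key.
    """
    collected = []
    max_id = None
    while True:
        page = fetch_page(count=batch_size, max_id=max_id)
        if not page:
            break  # an empty page means the timeline is exhausted
        collected.extend(page)
        # Next time, ask only for items strictly older than the oldest one seen.
        max_id = page[-1]["id"] - 1
    return collected


def make_fake_timeline(total):
    """Build a stand-in timeline of `total` items with ids total..1, newest first."""
    items = [{"id": i} for i in range(total, 0, -1)]

    def fetch_page(count, max_id=None):
        eligible = [it for it in items if max_id is None or it["id"] <= max_id]
        return eligible[:count]

    return fetch_page


tweets = fetch_all(make_fake_timeline(546))
print(len(tweets))  # 546
```

Checking for an empty page before touching page[-1] also covers handles with no tweets at all, a case where indexing alltweets[-1] right after the first request would raise an IndexError.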

Create tensors where all elements up to a given index are 1s, the rest are 0s

I have a placeholder lengths = tf.placeholder(tf.int32, [10]). Each of the 10 values assigned to this placeholder is <= 25. I now want to create a 2-dimensional tensor, called masks, of shape [10, 25], where each of the 10 vectors of length 25 has the first n elements set to 1 and the rest set to 0, with n being the corresponding value in lengths.
What is the easiest way to do this using TensorFlow's built in methods?
For example:
lengths = [4, 6, 7, ...]
-> masks = [[1, 1, 1, 1, 0, 0, 0, 0, ..., 0],
[1, 1, 1, 1, 1, 1, 0, 0, ..., 0],
[1, 1, 1, 1, 1, 1, 1, 0, ..., 0],
...
]
You can reshape lengths to a (10, 1) tensor and compare it with the index sequence 0, 1, 2, ..., 24. Thanks to broadcasting, the comparison is True wherever the index is smaller than the corresponding length and False otherwise; you can then cast the boolean result to 1s and 0s:
import tensorflow as tf

lengths = tf.constant([4, 6, 7])
n_features = 25

masks = tf.cast(tf.range(n_features) < tf.reshape(lengths, (-1, 1)), tf.int8)
with tf.Session() as sess:
    print(sess.run(masks))
#[[1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
# [1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
# [1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]]
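The same broadcasting comparison can be tried out in plain NumPy, which makes it easy to check the shapes without a session. This is only a sketch of the idea, not TensorFlow code; TensorFlow also provides tf.sequence_mask(lengths, maxlen), which builds this kind of mask directly.

```python
import numpy as np

lengths = np.array([4, 6, 7])
n_features = 25

# (25,) indices compared with (3, 1) lengths broadcasts to a (3, 25) boolean array.
masks = (np.arange(n_features) < lengths[:, None]).astype(np.int8)

print(masks.shape)        # (3, 25)
print(masks.sum(axis=1))  # [4 6 7]
```

Each row sum equals the corresponding length, which is a quick sanity check that the first n entries, and only those, are set to 1.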

Drawing letters in excel

Is there a plugin that helps generate letters (A-Z) in excel as seen below? Or can we write some sort of VBA script to do this?
Stackoverflow is not a code-for-me service. Anyhow, the task looked interesting, and I have decided to code something about it:
Option Explicit
Public Sub WriteLetterA()
    Dim varLetterA(8) As Variant
    Dim lngColCounter As Long
    Dim lngRowCounter As Long
    Dim blnReverse As Boolean
    Dim rngCell As Range
    blnReverse = True
    varLetterA(0) = Array(1, 1, 1, 0, 0, 1, 1, 1)
    varLetterA(1) = Array(1, 0, 0, 0, 0, 0, 0, 1)
    varLetterA(2) = Array(1, 0, 0, 1, 1, 0, 0, 1)
    varLetterA(3) = Array(1, 0, 0, 1, 1, 0, 0, 1)
    varLetterA(4) = Array(0, 0, 0, 1, 1, 0, 0, 0)
    varLetterA(5) = Array(0, 0, 0, 0, 0, 0, 0, 0)
    varLetterA(6) = Array(0, 0, 0, 0, 0, 0, 0, 0)
    varLetterA(7) = Array(0, 0, 1, 1, 1, 1, 0, 0)
    varLetterA(8) = Array(0, 0, 1, 1, 1, 1, 0, 0)
    Cells(1, 1).Select
    For lngRowCounter = 0 To UBound(varLetterA)
        For lngColCounter = 0 To UBound(varLetterA(lngRowCounter))
            Set rngCell = Cells(lngRowCounter + 1, lngColCounter + 1)
            If varLetterA(lngRowCounter)(lngColCounter) Then
                rngCell.Interior.Color = IIf(blnReverse, vbBlack, vbWhite)
            Else
                rngCell.Interior.Color = IIf(blnReverse, vbWhite, vbBlack)
            End If
        Next lngColCounter
    Next lngRowCounter
End Sub
' Points for improvement - varLetterA in a separate class
' Refer to the sheet, do not assume it
' Pass the first cell as a reference
This is what you get:
[Screenshots: the rendered letter with blnReverse = False and with blnReverse = True]
Take a look at the points for improvement - they can be useful, if you decide to build the rest of the alphabet. Good luck.

Sendmessage wm_paste textbox

I am trying to copy text from another window then use SendMessage to paste the text in a textbox. I have tried using:
textBox1.Paste()
and
textBox1.text = Clipboard.GetText()
but it seems these paste calls run before the SendMessage API call has taken effect. That is why I want to paste into the textbox with SendMessage as well, so that everything happens in the required order.
SendMessage(1508866, WM_COPY, 0, 0)
SendMessage(textBox1.handle, WM_PASTE, 0, 0) ' Does not paste anything in textbox.
EDIT:
Here is my code. Note, the clipboard method fires BEFORE sendmessage.
AppActivate("Hyperspace")
SetCursorPos(2271,214) ' Request
mouse_event(MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0)
mouse_event(MOUSEEVENTF_LEFTUP, 0, 0, 0, 0)
SetCursorPos(2726,111) ' Properties
mouse_event(MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0)
mouse_event(MOUSEEVENTF_LEFTUP, 0, 0, 0, 0)
SetCursorPos(2681,792) ' Get EOW
mouse_event(MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0)
mouse_event(MOUSEEVENTF_LEFTUP, 0, 0, 0, 0)
SetCursorPos(2853,525) ' Highlight EOW
mouse_event(MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0)
mouse_event(MOUSEEVENTF_LEFTUP, 0, 0, 0, 0)
mouse_event(MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0)
mouse_event(MOUSEEVENTF_LEFTUP, 0, 0, 0, 0)
SendMessage(1508866, WM_COPY, 0, 0)
textBox2.Text = Clipboard.GetText()
SetCursorPos(2983,719) ' Close
mouse_event(MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0)
mouse_event(MOUSEEVENTF_LEFTUP, 0, 0, 0, 0)
SetCursorPos(2967,783) ' Accept
mouse_event(MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0)
mouse_event(MOUSEEVENTF_LEFTUP, 0, 0, 0, 0)
SendMessage() is synchronous. It does not return until the message has been processed by the receiving window:
SendMessage(1508866, WM_COPY, 0, 0)
TextBox1.Text = Clipboard.GetText()
But why are you involving the clipboard at all? If you have the HWND of the external window, you could just use WM_GETTEXT to retrieve its text and then assign it to the Text property of your TextBox. No clipboard needed.