How to use cur.executemany() to store data from Twitter - sql

I am trying to download tweets from a list of three different accounts and then store all the information in a SQL3 database.
I have tried with the code below, but it seems to run forever. Am I missing something? Is this because I used .executemany() instead of .execute()?
step=0
a_list=["A","B","C"]
for s in a_list:
    cursor = tweepy.Cursor(api1.user_timeline, id = s, tweet_mode='extended').items(3189)
    for tweet in cursor:
        tw_text.append(tweet.full_text)
        created_at.append(tweet.created_at)
        rtws.append(tweet.retweet_count)
        favs.append(tweet.favorite_count)
        for h in tweet.entities['hashtags']:
            hashlist.append(h['text'])
        for u in tweet.entities['urls']:
            linklist.append(u['expanded_url'])
        try:
            medialist.append(media['media_url'] for media in tweet.entities['media'])
        except:
            pass
    step+=1
    print('step {} completed'.format(step))
#preparing all the data for .executemany()
g = [(s,tw,crea,rt,fv,ha,li,me) for s in ['GameOfThrones'] for tw in tw_text for crea in created_at for rt in rtws for fv in favs for ha in hashlist for li in linklist for me in medialist]
cur.executemany("INSERT INTO series_data VALUES (?,?,?,?,?,?,?,?)", (g))
con.commit()
print('db updated')
I expect the program to write the table to the SQL3 database, but I never see the message 'db updated' (i.e. the very last print() line).

cur.executemany() takes a list of tuples. Each tuple must have as many elements as the number of columns you are inserting values into.
For example, if you have a table with the following structure
create table tbl_test(firstname varchar(20), lastname varchar(20));
and you want to insert 3 records into it using executemany(), your data and the call should look like this
rows = [('Hans', 'Muster'), ('John', 'Doe'), ('Jane', 'Doe')]
cur.executemany('insert into tbl_test values(?, ?)', rows)
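Applied to the question's code, this probably also explains why it seems to run forever: a list comprehension with several for clauses builds the Cartesian product of all the lists, so with thousands of tweets g grows to an enormous number of tuples before executemany() is ever reached. A minimal sketch of the alternative, one tuple per tweet, follows. It assumes an in-memory sqlite3 database, a hypothetical column layout for series_data, and plain dicts standing in for tweepy tweet objects; joining hashtags, links and media URLs into one comma-separated string per tweet is likewise an assumption, not something the question specifies.

```python
import sqlite3

# Hypothetical stand-ins for tweepy tweets (assumption for illustration only).
fake_tweets = [
    {"text": "first", "created_at": "2019-04-01", "rt": 3, "fav": 5,
     "hashtags": ["a", "b"], "urls": ["http://example.com"], "media": []},
    {"text": "second", "created_at": "2019-04-02", "rt": 0, "fav": 1,
     "hashtags": [], "urls": [], "media": ["http://example.com/x.jpg"]},
]

con = sqlite3.connect(":memory:")
cur = con.cursor()
# Hypothetical table layout with the eight columns the INSERT expects.
cur.execute(
    "CREATE TABLE series_data (account TEXT, tweet TEXT, created_at TEXT, "
    "retweets INTEGER, favorites INTEGER, hashtags TEXT, links TEXT, media TEXT)"
)

rows = []
for account in ["A"]:
    for t in fake_tweets:
        # One tuple per tweet, built while the tweet is at hand.
        rows.append((account, t["text"], t["created_at"], t["rt"], t["fav"],
                     ",".join(t["hashtags"]), ",".join(t["urls"]),
                     ",".join(t["media"])))

cur.executemany("INSERT INTO series_data VALUES (?,?,?,?,?,?,?,?)", rows)
con.commit()
row_count = cur.execute("SELECT COUNT(*) FROM series_data").fetchone()[0]
print(row_count)
```

The number of rows inserted equals the number of tweets, rather than the product of the lengths of eight separate lists.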

Related

ProgrammingError: ('Expected 0 parameters, supplied 391', 'HY000') with 391 columns using dynamic approach

I have a dataframe that contains 391 columns and a number of rows. I am trying to push this to a database via pyodbc and using the following command:
cursor = conn.cursor()
cursor.fast_executemany = True
cursor.executemany(
    f"INSERT INTO db.tble({', '.join(df.columns.tolist())}) VALUES ({('?,' * len(df.columns))[:-1]})",
    list(df.itertuples(index=False, name=None))
)
cursor.commit()
I would have thought this method would be dynamic for a dataframe of any size, yet I get the following error:
ProgrammingError: ('Expected 0 parameters, supplied 391', 'HY000')
I am struggling to understand this, as the syntax looks correct and ? has been used instead of %s, as other answers suggest. Can someone please help?
Thanks
I once wrote a piece of code that builds the INSERT statement dynamically from the number of columns in the data frame.
This is the form of the query that is passed to the database:
INSERT INTO dbo.Table (column1,column2,column3) VALUES (?,?,?)
The number of columns and of '?' placeholders has to be generated at runtime from the number of columns in the data frame.
The snippet below builds the placeholder string (?,?,?) and concatenates it with the rest of the INSERT query.
Here:
df is the dataframe,
symbol_counter holds the number of columns in the dataframe minus one (the list already starts with one '?'),
sym_string is the final string, i.e. ?,?,?,...,? with one '?' per column.
symbol = ['?']
sym_string = ''
symbol_counter = int(df.shape[1])-1
for word in range(symbol_counter):
    # add one more placeholder per remaining column
    symbol.insert(word, "?")
sym_string = (','.join(symbol))
# then concatenate this variable with the rest of the query as shown below
query = Variable_holding_first_partofthequery + " VALUES (" +sym_string+")"
I know it's the long way round, but that's how I got it to work. Good luck!
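A shorter way to get the same placeholder string is to join one '?' per column. The sketch below is only an illustration: build_insert is a hypothetical helper, the table and column names are taken from the example query above, and it assumes the names come from a trusted source, since they are interpolated into the SQL text rather than bound as parameters.

```python
def build_insert(table, columns):
    """Return an INSERT statement with one '?' placeholder per column."""
    placeholders = ", ".join("?" for _ in columns)
    column_list = ", ".join(columns)
    return f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})"


query = build_insert("dbo.Table", ["column1", "column2", "column3"])
print(query)
```

With a dataframe, the column list would be df.columns.tolist(), as in the question.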

show query parameters that don't select anything

I have a table with a text column and I would like to select all rows that match the list of search parameters that were provided by the user:
select * from value where value.text in ('Mary', 'Steve', 'Walter');
In addition, I want to notify the user if any of his search terms could not be found. Let's say 'Steve' does not exist in the value.text column, how can I write a query that will show 'Steve'? As that information does not exist in any table, I have no idea how it could be done using a SQL query.
The actual Hibernate code looks like this:
List<String> searchItemList = new ArrayList<>();
searchItemList.add("Mary");
searchItemList.add("Steve");
searchItemList.add("Walter");
Query query = em.createQuery("select v from Value as v where v.text in ( :searchitemlist )");
query.setParameter("searchitemlist", searchItemList);
List result = query.getResultList();
log.info("{}", result.size());
log.info("{}", result);
The searchItemList is a list of all search terms provided by the user. It can be a few hundred entries long. The current workaround is to search the value table once for each searchItem and note all queries that return 0 rows. That is rather inefficient; surely there is a better approach? Please advise.
You can use the following query to get the search items that do exist in the database:
SELECT DISTINCT value.text from value where value.text in ('Mary', 'Steve', 'Walter');
After running this query, assuming its output is stored in a collection called result, the search items that were not found are the difference between the two collections (shown here with C#'s LINQ Except; in Java, copying searchItemList and calling removeAll(result) on the copy has the same effect):
IEnumerable<string> notExistSearchListItems = searchItemList.Except(result);
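The same idea, one query followed by a set difference on the client side, can be sketched in Python. This is only an illustration under assumptions: an in-memory sqlite3 table stands in for the real value table, and the sample rows are made up.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE value (text TEXT)")
# Made-up sample data: 'Steve' is deliberately absent.
con.executemany("INSERT INTO value (text) VALUES (?)",
                [("Mary",), ("Walter",), ("Mary",)])

search_items = ["Mary", "Steve", "Walter"]

# One query for all terms, with one '?' placeholder per term.
placeholders = ", ".join("?" for _ in search_items)
found = {
    row[0]
    for row in con.execute(
        f"SELECT DISTINCT text FROM value WHERE text IN ({placeholders})",
        search_items,
    )
}

# Terms the user asked for that matched no row, in the original order.
missing = [item for item in search_items if item not in found]
print(missing)
```

This replaces one query per search term with a single round trip; the comparison against the original list happens in application code.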

Postgres: on conflict, summing two vectors (arrays)

I'm trying to handle a column that holds an array of counters in Postgres.
For example, let's say I have this table:
name | counters
Joe  | [1,3,1,0]
Now I'm adding 2 rows, ("Ben", [1,3,1,0]) and ("Joe", [2,0,2,1]).
On conflict, I expect the query to sum the two counter vectors element-wise ([1,3,1,0] + [2,0,2,1] = [3,3,3,1]).
The expected result:
name | counters
Joe  | [3,3,3,1]
Ben  | [1,3,1,0]
I tried this query
insert into test (name, counters)
values ("Joe",[2,0,2,1])
on conflict (name)
do update set
    counters = array_agg(unnest(test.counters) + unnest([2,0,2,1]))
But it didn't seem to work. What am I missing?
There are two problems with the expression
array_agg(unnest(test.counters) + unnest([2,0,2,1]))
- there is no + operator for arrays,
- you cannot use set-valued expressions as an argument in an aggregate function.
You need to unnest both arrays in a single unnest() call placed in the from clause:
insert into test (name, counters)
values ('Joe', array[2,0,2,1])
on conflict (name) do
update set
    counters = (
        select array_agg(e1 + e2)
        from unnest(test.counters, excluded.counters) as u(e1, e2)
    )
Also note the correct literal syntax in values (single-quoted strings and array[...]) and the use of the special record excluded, which holds the row proposed for insertion (see the documentation for INSERT ... ON CONFLICT).
Test it in db<>fiddle.
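To make the intended result concrete, here is a small Python model of what the upsert computes. It is only a sketch of the semantics, not of how Postgres executes the statement: a dict stands in for the table, and upsert_counters is a hypothetical helper. One difference worth knowing: zip stops at the shorter list, whereas unnest with several arrays pads the shorter one with NULLs.

```python
def upsert_counters(table, name, counters):
    """Insert name -> counters, or add element-wise to the existing row."""
    if name in table:
        # Conflict on name: sum the stored and the incoming counters pairwise.
        table[name] = [old + new for old, new in zip(table[name], counters)]
    else:
        table[name] = list(counters)


table = {"Joe": [1, 3, 1, 0]}
upsert_counters(table, "Ben", [1, 3, 1, 0])
upsert_counters(table, "Joe", [2, 0, 2, 1])
print(table)
```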
Based on your reply to my comments that it will always be four elements in the array and the update is being done by a program of some type, I would suggest something like this:
insert into test (name, counters)
values (:NAME, :COUNTERS)
on conflict (name) do
update set
    counters[1] = counters[1] + :COUNTERS[1],
    counters[2] = counters[2] + :COUNTERS[2],
    counters[3] = counters[3] + :COUNTERS[3],
    counters[4] = counters[4] + :COUNTERS[4]

Inserting Python list into SQL db

I have a Python list containing last_name, first_name, title values.
I would like to insert this list into a SQL database, with the data going into 3 columns and many rows.
I tried the following:
data = updated_values[7:-1] #list has around 200 names
query = "INSERT INTO names(last_name, first_name, titles) VALUES(%s,%s,%s)"
cursor.executemany(query,data)
but it throws an error.
Can anyone help me with this?
Thank you all for your help. I was able to find the solution, and here it is:
it = iter(updated_values[7:-1])
updated_values_1 = list(zip(it, it, it))
query = "INSERT INTO names(last_name, first_name, titles) VALUES(%s,%s,%s)"
cursor.executemany(query,updated_values_1)
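The zip(it, it, it) call works because all three arguments are the same iterator, so each output tuple consumes three consecutive items from the flat list. A minimal illustration, with made-up names standing in for updated_values[7:-1]:

```python
# Flat list in the order last_name, first_name, title (made-up sample data).
flat = ["Doe", "Jane", "Dr.", "Muster", "Hans", "Mr.", "Roe", "Richard", "Prof."]

it = iter(flat)
# Each tuple pulls three successive items from the single shared iterator.
rows = list(zip(it, it, it))
print(rows)
```

If the length of the list is not a multiple of three, zip silently drops the incomplete trailing group.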
new_data = []
# data is flat: last_name, first_name, title, last_name, first_name, title, ...
last_names = data[::3]
first_names = data[1::3]
titles = data[2::3]
for x, last_name in enumerate(last_names):
    new_data.append((last_name, first_names[x], titles[x]))
query = "INSERT INTO names(last_name, first_name, titles) VALUES(%s,%s,%s)"
cursor.executemany(query,new_data)
Basically, separate your data into three lists, then loop through one of them and build a new list of tuples. This is untested, so I'm sure you'll have to modify something.

How do I write a SQL query where the input is a string but the value in the database is a number?

I want to write a SQL SELECT where the input is a string but the value stored in the database is a number.
Example:
Input: Wait Approve
ApproveStatus (nvarchar)
"0"
"2"
"4"...
ApproveStatus = 2 means "Wait Approve".
I want a SQL script that takes the value "Wait Approve" as input and returns the rows where ApproveStatus = 2.
This is my first question.
Something like this?
Given the following data:
CREATE TABLE statuslookup (
    id int unique,
    label nvarchar(50) unique
);
INSERT INTO statuslookup (id, label) values (2, 'Wait Approve');
Then if you need to convert from label to id:
SELECT id FROM statuslookup WHERE label = 'Wait Approve';
Hopefully that at least gets you started.....
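From application code, the lookup would normally be run with the label bound as a parameter. A minimal sketch in Python, assuming an in-memory sqlite3 database in place of the real one; status_id is a hypothetical helper, and the labels for 0 and 4 are invented for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE statuslookup (id INTEGER UNIQUE, label TEXT UNIQUE)")
con.executemany(
    "INSERT INTO statuslookup (id, label) VALUES (?, ?)",
    # Only (2, 'Wait Approve') comes from the question; the others are invented.
    [(0, "Draft"), (2, "Wait Approve"), (4, "Approved")],
)


def status_id(label):
    """Return the numeric id for a status label, or None if it is unknown."""
    row = con.execute(
        "SELECT id FROM statuslookup WHERE label = ?", (label,)
    ).fetchone()
    return None if row is None else row[0]


wait_id = status_id("Wait Approve")
print(wait_id)
```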