SQLite and Python are inserting unwanted linebreaks into database

I'm writing a little program that inserts and retrieves movie information in a database. If I type the info into the .db file manually, the program retrieves it and displays it perfectly in my GUI's textboxes. However, if I try to insert data from those textboxes into my database, it adds a linebreak to the end of each field. When I retrieve it again, it has "\n" at the end. The retrieving works great, but somehow linebreaks are being included in my inserts. Here is my insert function:
def addRecord(self):
    newTitle = self.text_title.get(1.0, 'end')
    newSynopsis = self.text_summary.get(1.0, 'end')
    newCast = self.text_cast.get(1.0, 'end')
    newRuntime = self.text_runtime.get(1.0, 'end')
    newRating = self.text_rating.get(1.0, 'end')
    newTrailer = self.text_trailer.get(1.0, 'end')
    newDates = self.text_date2d.get(1.0, 'end')
    newImage = self.text_image.get(1.0, 'end')
    c.execute("INSERT INTO Movies (Title, Synopsis, Actors, Runtime, Rating, Trailer, Dates, Image) VALUES ('{}','{}','{}','{}','{}','{}','{}','{}');"
              .format(newTitle, newSynopsis, newCast, newRuntime, newRating, newTrailer, newDates, newImage))
    conn.commit()

Tkinter automatically adds a newline as the last character of a text widget. To get only the contents that you or the user actually put in the widget, use "end-1c" ("end" minus one character) rather than "end" or END.
Also, to be pedantic, the first index should be a string, not a float. In the case of 1.0 it doesn't matter, but you should be in the habit of always using strings for the indexes in a text widget.
For example:
newTitle = self.text_title.get("1.0", 'end-1c')
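Putting both suggestions together, a minimal sketch of the asker's addRecord (keeping the widget, c and conn names from the question) could look like the following; it also switches to sqlite3 parameter substitution rather than .format, which avoids quoting problems with values that contain apostrophes:
def addRecord(self):
    # "end-1c" stops one character before "end", so the trailing newline is not included
    newTitle = self.text_title.get("1.0", "end-1c")
    newSynopsis = self.text_summary.get("1.0", "end-1c")
    newCast = self.text_cast.get("1.0", "end-1c")
    newRuntime = self.text_runtime.get("1.0", "end-1c")
    newRating = self.text_rating.get("1.0", "end-1c")
    newTrailer = self.text_trailer.get("1.0", "end-1c")
    newDates = self.text_date2d.get("1.0", "end-1c")
    newImage = self.text_image.get("1.0", "end-1c")
    # ? placeholders let sqlite3 quote the values itself
    c.execute("INSERT INTO Movies (Title, Synopsis, Actors, Runtime, Rating, Trailer, Dates, Image) "
              "VALUES (?,?,?,?,?,?,?,?)",
              (newTitle, newSynopsis, newCast, newRuntime, newRating,
               newTrailer, newDates, newImage))
    conn.commit()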

cursor.execute() not uploading rows to sqlite database

I built a scraper that saves to a .csv file and am now attempting to save rows from that .csv file to an sqlite3 database with an IF statement, but it's not working. I've tried formatting the values in a dozen different ways and am getting nowhere.
"Match" prints every time the IF statement is True, but the row doesn't get added to the sqlite database. Calling cur.fetchall()/one()/etc results in 'None' being returned.
db = sqlite3.connect(':memory:')
cur = db.cursor()
cur.execute("DROP TABLE IF EXISTS jobs_table")
cur.execute('''CREATE TABLE IF NOT EXISTS
               jobs_table(id TEXT,
                          date TEXT,
                          company TEXT,
                          position TEXT,
                          tags TEXT,
                          description TEXT,
                          url TEXT)''')
skills = ('python')
for row in csv_data:
    if skills in row.get('description').lower():
        print('')
        print('Match!')
        cur.execute("""INSERT INTO jobs_table(id,
                                              date,
                                              company,
                                              position,
                                              tags,
                                              description,
                                              url) VALUES(:id,
                                                          :epoch,
                                                          :date,
                                                          :company,
                                                          :position,
                                                          :tags,
                                                          :description,
                                                          :url)""", row)
I assume the problem is in my cur.execute() function, but I can't figure out how else it should be run. Any takers?
If you are calling cur.fetchone() right after the INSERT's cur.execute(), it is normal to get None (or [] for cur.fetchall()). You need to execute a query first to get results back, for example cur.execute("SELECT * FROM jobs_table").
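As a quick check, here is a minimal sketch using the db, cur and jobs_table names from the question; the fetch calls only return rows after a SELECT has been run on the cursor, and for a file-based database (rather than :memory:) you would also want a commit:
# Run a SELECT on the same cursor before fetching
cur.execute("SELECT * FROM jobs_table")
rows = cur.fetchall()            # now returns the inserted rows instead of []
print(len(rows), "matching rows stored")

db.commit()                      # only matters for a file-based database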

SQL query returns null values when written to file in groovy

I am attempting to export data and write to a formatted file in groovy 2.1.6. The query returns a null value for an entire column included in the query.
null, 0000001,1434368,ACTIVE
null, 0000002,1354447,ACTIVE
null, 0000004,1358538,ACTIVE
Here is the code that I am using in Groovy to query and write the data to a file.
private void profilerSql() {
    def today = new Date()
    def formattedDate = today.format('yyyyMMdd')
    String reportSql
    reportSql = """
        SELECT
            col_1,
            col_2,
            col_3,
            col_4
        from my_table
    """
    sql.execute(reportSql)
    def filename = "My_Table_export_" + formattedDate + ".csv"
    // Create the file Object
    File outputFile = new File(filename);
    // Write a blank line to it to create a new "empty" file
    outputFile.write("");
    // Iterate through the SQL recordset. Output settings are defined within the function.
    sql.eachRow(reportSql) {
        // Create each line, joining the columns with a comma.
        def reportLine = [it.col_1, it.col_2, it.col_3, it.col_4].join(',')
        // Write the line to the file. End with a new line char.
        outputFile.append(reportLine + System.getProperty("line.separator"))
    }
}
Possibly relevant: the column that returns null values was created as a sequence in Oracle 11g. If anyone can provide some insight, even into how Groovy interacts with different data types in Oracle databases, I would be grateful.
I see a couple of questionable things in the code, though none of them are about getting a sequence column out of Oracle - I wouldn't really expect that to be much of a problem, since JDBC has been around for years and years.
I don't think you need the initial call to sql.execute(reportSql) - execute returns a boolean rather than a result set, and sql.eachRow(reportSql) runs the query itself.
Shouldn't the first parm to the outputFile.append be reportLine and not lineFormat?
Hope this helps!

Error in programmatically copying data to SpFieldChoice field SharePoint 2010

I am new to SharePoint. I have a custom field type derived from SpFieldChoice, and my field allows users to select multiple values. I need to replace some old custom columns with the new column and copy the data from the old column to the new one. The old column also allows users to select multiple values by ticking checkboxes. I have the following code to copy the data to the new field:
foreach (SPListItem item in list.Items)
{
    if (item[oldField.Title] == null)
    {
        item[newFld.Title] = string.Empty;
        item.Update();
    }
    else
    {
        string[] itemvalues = item[oldField.Title].ToString().Split(new string[] { ";#" }, StringSplitOptions.None);
        StringBuilder multiLookupValues = new StringBuilder();
        multiLookupValues.Append(";#");
        for (int cnt = 0; cnt < (itemvalues.Length) / 2; cnt++)
        {
            multiLookupValues.Append(itemvalues[(cnt * 2) + 1].ToString() + ";#");
        }
        item[newFld.Title] = multiLookupValues.ToString();
        item.SystemUpdate(false);
    }
}
This code works fine as long as the resulting StringBuilder is shorter than 255 characters, but when the length exceeds 255 I get the following exception:
Invalid choice Value. A choice field contains invalid data.Please check the value and try again.
Is there any other way of copying data to SpFieldChoice? How can I resolve this problem?
Do the update in multiple passes so that the string doesn't exceed the limit - i.e. append with value +=. However, if the problem is that the value can't be longer than 255, you have to consider how you are doing the choices. If it is exceeding the length and updating the value multiple times doesn't work (and a Site Column will have the same limitations), you can do the next best thing:
1) Create a new list that will hold the choices
2) Change the destination field to be a lookup
3) Update accordingly for each item (picking up the ID from the lookup field)
There's no limit to this.
David Sterling
david_sterling#sterling-consulting.com
www.sterling-consulting.com

Struggling with SQL BCP (uniqidentifier, default values on columns...)

EDIT: My only pending issue is c) (True and False in the file, bit in the database; I can change neither the file nor the database schema, there are hundreds of terabytes I can't touch).
The system receives a file (hundreds of thousands of them, actually) with a certain format. Things are:
a) The first field is a uniqueidentifier (more on this later)
b) In the database, the table's first 4 columns are generated by the database (they are date-related), meaning that those 4 values are not found in the files (all the rest are, and in order, even if they always appear as their text representation or are empty)
c) Bit values are represented as False/True in the file.
So, the issue for a) is that in the text file I receive as input, the uniqueidentifier is wrapped in brackets. When I tried to generate the format file with the format nul option of the bcp command-line tool, it made that field a SQLCHAR of 37 characters (which makes no sense to me, since it should be either 36 or 38).
Row separator is "+++\r\n", column separator is "©®©".
How would I go about generating the format files? I've been stuck on this for some time; I had never used bcp before, and the errors I get don't really tell me much ("Unexpected EOF encountered in BCP data-file").
Am I supposed to specify all the columns in the format file or just the ones I desire to read from the files I get?
Thanks!
NOTE: I can't provide the SQL schema since it's for the company I work for. But it's pretty much: smalldate, tinyint, tinyint, tinyint (these four are generated by the db), uniqueidentifier, chars, chars, more varchars, some bits, more varchars, some nvarchar. ALL values, except for those generated by the db, accept null.
My current problem is with skipping the first 4 columns.
http://msdn.microsoft.com/en-us/library/ms179250(v=SQL.105).aspx
I followed that guide but somehow it's not working. Here are the changes (I'm just hard-changing column names to keep the project's details private, even if it sounds stupid).
This is the one generated with bcp (with format nul -c); note I put it as a link because it's not that short:
http://pastebin.com/4UkpPp1n
The second one, which is supposed to do the same but ignore the first 4 columns, is in the next pastebin:
http://pastebin.com/Lqj6XSbW
Yet it is not working. The error is "Error = [Microsoft][SQL Native Client]The number of fields provided for bcp operation is less than the number of columns on the server.", which is exactly what skipping those columns in the format file was supposed to handle.
Any help will be greatly appreciated.
I'd create a new table with a CHAR(38) for the GUID. Import your data into this staging table, then translate it with CAST(SUBSTRING(GUID, 2, 36) AS UNIQUEIDENTIFIER) to import the staging data into your permanent table. This approach also works well for dates in odd formats, numbers with currency symbols, or generally any kind of poorly-formatted input.
BCP format files are a little touchy, but fundamentally aren't too complicated. If that part continues to give you trouble, one option is to import the whole row as a single VARCHAR(1000) field, then split it up within SQL - if you're comfortable with SQL text processing that is.
Alternately, if you are familiar with some other programming language, like Perl or C#, you can create a script to pre-process your inputs into a more friendly form, like tab-delimited. If you're not familiar with some other programming language, I suggest you pick one and get started! SQL is a great language, but sometimes you need a different tool; it's not great for text processing.
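For illustration, here is a minimal Python sketch of such a pre-processing script; the file names are hypothetical, and it assumes the "+++\r\n" row separator and "©®©" column separator described in the question. It also assumes the data itself contains no tabs, and for files of the size described you would stream instead of reading everything at once.
# Hypothetical pre-processing sketch: convert the custom delimiters to tab-delimited
ROW_SEP = "+++\r\n"
COL_SEP = "©®©"

with open("input.dat", encoding="utf-8", newline="") as src:
    raw = src.read()

with open("output.tsv", "w", encoding="utf-8", newline="") as dst:
    for record in raw.split(ROW_SEP):
        if not record.strip():
            continue                      # skip the empty chunk after the last row separator
        dst.write("\t".join(record.split(COL_SEP)) + "\r\n")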
If you're familiar with C#, here's my code to generate a format file. No one gets to make fun of my Whitestone indentation :P
private static string CreateFormatFile(string filePath, SqlConnection connection, string tableName, string[] sourceFields, string[] destFields, string fieldDelimiter, string fieldQuote)
    {
    string formatFilePath = filePath + ".fmt";
    StreamWriter formatFile = null;
    SqlDataReader data = null;
    try
        {
        // Load the metadata for the destination table, so we can look up fields' ordinal positions
        SqlCommand command = new SqlCommand("SELECT TOP 0 * FROM " + tableName, connection);
        data = command.ExecuteReader(CommandBehavior.SchemaOnly);
        DataTable schema = data.GetSchemaTable();
        Dictionary<string, Tuple<int, int>> metadataByField = new Dictionary<string, Tuple<int, int>>();
        foreach (DataRow row in schema.Rows)
            {
            string fieldName = (string)row["ColumnName"];
            int ordinal = (int)row["ColumnOrdinal"] + 1;
            int maxLength = (int)row["ColumnSize"];
            metadataByField.Add(fieldName, new Tuple<int, int>(ordinal, maxLength));
            }
        // Begin the file, including its header rows
        formatFile = File.CreateText(formatFilePath);
        formatFile.WriteLine("10.0");
        formatFile.WriteLine(sourceFields.Length);
        // Certain strings need to be escaped to use them in a format file
        string fieldQuoteEscaped = fieldQuote == "\"" ? "\\\"" : fieldQuote;
        string fieldDelimiterEscaped = fieldDelimiter == "\t" ? "\\t" : fieldDelimiter;
        // Write a row for each source field, defining its metadata and destination field
        for (int i = 1; i <= sourceFields.Length; i++)
            {
            // Each line contains (separated by tabs): the line number, the source type, the prefix length, the field length, the delimiter, the destination field number, the destination field name, and the collation set
            string prefixLen = i != 1 || fieldQuote == null ? "0" : fieldQuote.Length.ToString();
            string fieldLen;
            string delimiter = i < sourceFields.Length ? fieldQuoteEscaped + fieldDelimiterEscaped + fieldQuoteEscaped : fieldQuoteEscaped + @"\r\n";
            string destOrdinal;
            string destField = destFields[i - 1];
            string collation;
            if (destField == null)
                {
                // If a field is not being imported, use ordinal position zero and a placeholder name
                destOrdinal = "0";
                fieldLen = "32000";
                destField = "DUMMY";
                collation = "\"\"";
                }
            else
                {
                Tuple<int, int> metadata;
                if (metadataByField.TryGetValue(destField, out metadata) == false) throw new ApplicationException("Could not find field \"" + destField + "\" in table \"" + tableName + "\".");
                destOrdinal = metadata.Item1.ToString();
                fieldLen = metadata.Item2.ToString();
                collation = "SQL_Latin1_General_CP1_CI_AS";
                }
            string line = String.Join("\t", i, "SQLCHAR", prefixLen, fieldLen, '"' + delimiter + '"', destOrdinal, destField, collation);
            formatFile.WriteLine(line);
            }
        return formatFilePath;
        }
    finally
        {
        if (data != null) data.Close();
        if (formatFile != null) formatFile.Close();
        }
    }
There was some reason I didn't use a using block for the data reader at the time.
It seems that BCP cannot interpret True and False as bit values. It's better either to go with SSIS or to first replace those strings in the text files (creating views or anything like that is not a good idea, as it adds more overhead).
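A small sketch of that "replace the contents of the text first" idea, again with hypothetical file names and the separators from the question, so that only whole True/False fields are rewritten rather than every occurrence of those words:
ROW_SEP = "+++\r\n"
COL_SEP = "©®©"

with open("input.dat", encoding="utf-8", newline="") as src:
    rows = src.read().split(ROW_SEP)

fixed = []
for row in rows:
    fields = row.split(COL_SEP)
    # Map whole-field True/False to 1/0; everything else passes through untouched
    fields = ["1" if f == "True" else "0" if f == "False" else f for f in fields]
    fixed.append(COL_SEP.join(fields))

with open("input_bits_fixed.dat", "w", encoding="utf-8", newline="") as dst:
    dst.write(ROW_SEP.join(fixed))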

Rails3: SQL execution with hash substitution like .where()

With a simple model like that
class Model < ActiveRecord::Base
  # ...
end
we can do queries like that
Model.where(["name = :name and updated_at >= :D", \
{ :D => (Date.today - 1.day).to_datetime, :name => "O'Connor" }])
Where the values in the hash will be substituted into the final SQL statement with proper escaping depending on the underlying database engine.
I would like to know a similar feature for SQL execution like:
ActiveRecord::Base.connection.execute( \
["update models set name = :name, hired_at = :D where id = :id;"], \
{ :id => 73465, :D => DateTime.now, :name => "O'My God" }] \
) # THIS CODE IS A FANTASY. NOT WORKING.
(Please do not solve the example with loading a Model object, modifying and then saving! The example is only an illustration for the feature I would like to have / know. Concentrate on the subject!)
The original problem is that I want to insert large amount (many thousand lines) of data into the database. I want to use some features of the SQL abstraction of the ActiveRecord framework but I don't want to use model objects based on ActiveRecord::Base because they are damn slow! (8 queries per second for my current problem.)
query = ActiveRecord::Base.connection.raw_connection.prepare("INSERT INTO users (name) VALUES(:name)")
query.execute(:name => 'test_name')
query.close
Extending the @peufeu solution with a concrete code example for bulk insert:
users_places = []
users_values = []
timestamp = Time.now.strftime('%Y-%m-%d %H:%M:%S')
params[:users].each do |user|
  users_places << "(?,?,?,?)"
  users_values << user[:name] << user[:punch_line] << timestamp << timestamp
end
bulk_insert_users_sql_arr = ["INSERT INTO users (name, punch_line, created_at, updated_at) VALUES #{users_places.join(", ")}"] + users_values
begin
  sql = ActiveRecord::Base.send(:sanitize_sql_array, bulk_insert_users_sql_arr)
  ActiveRecord::Base.connection.execute(sql)
rescue
  "something went wrong with the bulk insert sql query"
end
Here is the reference to the sanitize_sql_array method in ActiveRecord::Base; it generates the proper query string by escaping the single quotes in the strings. For example, the punch_line "Don't let them get you down" will become "Don\'t let them get you down".
Yes you could do raw SQL, but checkout the ar-extensions gem that helps with batch inserts:
https://github.com/zdennis/ar-extensions
Here's a post on it, and various other techniques:
http://www.coffeepowered.net/2009/01/23/mass-inserting-data-in-rails-without-killing-your-performance/
For INSERTs, batching them using a long VALUES clause (as shown by Simon's link) is the fastest way (unless you want to generate a text file and load it in your database with MySQL's LOAD DATA INFILE). But you have to be very careful about escaping your text values (which is not done in the example).
I was asking "what database are you using" because it does matter for mass UPDATEs.
For instance, you can do this on Postgres (and, I believe, on SQL Server by changing "columnX" to "colX"):
UPDATE foo
SET bar = v.column2
FROM (VALUES (1,2),(3,4),... long list) AS v
WHERE foo.id = v.column1
And you can update a load of rows using a single statement, very fast.
If you don't need Ruby to perform some Ruby-specific magic on your data, the fastest way to transfer data from one DB to a different one is to export as a text file (CSV or tab separated), load it on the other DB (LOAD DATA INFILE on MySQL), perhaps in a temporary table, and bulk process using SQL.
EDIT: Here's how I do this in Python:
sql = "INSERT INTO foo (column list) VALUES "
placeholders = []
values = []
for tup in tuple_list:
    placeholders.append("(?,?,?,?)")
    values.extend(tup)
sql += ",".join(placeholders)
After joining the placeholder groups, sql is "INSERT INTO foo (column list) VALUES (?,?,?,?),(?,?,?,?),(?,?,?,?)" with the "(?,?,?,?)" repeated as many times as you have lines to insert.
Then "values" contains a list of (a1,b1,c1,d1,a2,b2,c2,d2,a3,b3,c3,d3) with an,bn,cn,dn being the tuples you want to insert for line n. Each one corresponds to a placeholder in the sql string.
Then pass this to the usual "execute query with parameters" function which will handle quoting and escaping as usual.
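Put together, a minimal runnable sketch using Python's sqlite3 module (any DB-API driver with ?-style parameters looks the same; the table and columns here are made up for the example):
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (a INTEGER, b INTEGER, c INTEGER, d INTEGER)")

tuple_list = [(1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12)]

placeholders = []
values = []
for tup in tuple_list:
    placeholders.append("(?,?,?,?)")
    values.extend(tup)

# One statement, one round trip; the driver handles quoting and escaping of the values
sql = "INSERT INTO foo (a, b, c, d) VALUES " + ",".join(placeholders)
conn.execute(sql, values)
conn.commit()

print(conn.execute("SELECT COUNT(*) FROM foo").fetchone()[0])   # -> 3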
I encountered a similar issue recently when trying to insert 100K+ records into a MySQL database for a Rails 4 app using the mysql2 gem. The data included characters that had to be sanitized prior to insert.
The solution I ended going with was a slightly modified version of Option 3 described at https://www.coffeepowered.net/2009/01/23/mass-inserting-data-in-rails-without-killing-your-performance/
Here's the relevant code block from the above link:
TIMES = 10000
inserts = []
TIMES.times do
  inserts.push "(3.0, '2009-01-23 20:21:13', 2, 1)"
end
sql = "INSERT INTO user_node_scores (`score`, `updated_at`, `node_id`, `user_id`) VALUES #{inserts.join(", ")}"
The modification I made was using the public method ActiveRecord::Base.sanitize() on values that required it.
inserts = []
created = Time.now.strftime "%Y-%m-%d %H:%M:%S"
params[:audits].each do |audit|
  inserts.push "(#{audit.user_id}, '#{created}', " + ActiveRecord::Base.sanitize(audit.comment) + ", #{audit.status})"
end
sql = "INSERT INTO user_audits (`user_id`, `created_at`, `comment`, `status`) VALUES #{inserts.join(", ")}"