Encryption in SQL (SQLite) - sql

I want to encrypt particular data I wish in the tables..
For example,
I need something like this:
update mytable set column1 = encrypt(column1, "key") where condition;
and this:
select decrypt(column1, "key") from mytable where condition;
Is threre any simple in-built SQL function in SQLite to accomplish
this?
I have a Java function for encrypt() and decrypt(), I need to bulk encrypt the table column and it will be too slow if I read the column, apply the function and then write back. Please advise.
Thanks in advance.

SQLite has possibility to supply your own functions - create_function sqlite function - that you can than use in the SQL statement.
So I would look for create_function in java, e.g:
http://ppewww.physics.gla.ac.uk/~tdoherty/sqlite/javasqlite-20050608/doc/SQLite/Database.html
or here is even an example
http://www.daniweb.com/software-development/java/threads/221260
For Example, in python it looks like this (example taken from http://docs.python.org/library/sqlite3.html Connection.create_function)
import sqlite3
import md5
def md5sum(t):
return md5.md5(t).hexdigest()
con = sqlite3.connect(":memory:")
con.create_function("md5", 1, md5sum)
cur = con.cursor()
cur.execute("select md5(?)", ("foo",))
print cur.fetchone()[0]
Or: http://www.sqlite.org/c3ref/create_function.html

Related

R: Summary of SQL Tables

I am working with the R programming language.
Normally, when I want to get the summary of a table, I can use something like the "str()" function or the "summary()" function:
str(my_table)
summary(my_table)
However, now I am trying to do this with tables on a server.
For instance, I am trying to get the summaries of variable types for a specific table (e.g. "my_table") on a server. I found a very indirect way to do this:
#load libraries
library(OBDC)
library(RODBC)
library(dbi)
#establish a connection and name it as "dbhandle"
rs <- dbSendQuery(dbhandle, 'select * from my_table limit 1')
dbColumnInfo(rs)
My Question: Is there a more "direct" way to do this? For example, can I get information about each column (e.g. whether the column is integer, character, date, etc.) in a table without first sending the query and then requesting the information? Can I do this directly?
Thanks!
You could try using fetch() from "RMySQL" to turn your SQL query into an R object (e.g. data frame)
library(RMySQL)
rs <- dbSendQuery(dbhandle, 'select * from my_table limit 1')
# Get the results from MySQL into R
my_table = fetch(rs, n=-1)
# clear result
dbClearResult(rs)
rm(rs)
Then use the functions you describe.
str(my_table)
summary(my_table)

Select Statement in SQLite Python - Using Variables in WHERE clause

Say I have a class variable restemail which stores the email id I need to use to sort out from the select statement in SQLite (Python). Whenever I refer to that variable after my WHERE clause, SQLite treats it as a column and returns an error saying that such a column doesn't exist. Something like this:
restemail=StringVar()
Password=StringVar()
def database(self):
conn = sqlite3.connect('data.db')
with conn:
cursor=conn.cursor()
strrest = self.restemail
cursor.execute('SELECT * FROM Restaurant3 WHERE restemail = strrest')
Can someone tell me how to use a variable inside my SQL queries without it being treated as a column name?
Any help will be appreciated.
Try the sqlite3 variable substitution syntax:
cursor.execute('SELECT * FROM Restaurant3 WHERE restemail = ?', (strrest,))

IPython SQL Magic - Generate Query String Programmatically

I'm generating SQL programmatically so that, based on certain parameters, the query that needs to be executed could be different (i.e., tables used, unions, etc). How can I insert a string like this: "select * from table", into a %%sql block? I know that using :variable inserts variable into the %%sql block, but it does so as a string, rather than sql code.
The answer was staring me in the face:
query="""
select
*
from
sometable
"""
%sql $query
If you want to templatize your queries, you can use string.Template:
from string import Template
template = Template("""
SELECT *
FROM my_data
LIMIT $limit
""")
limit_one = template.substitute(limit=1)
limit_two = template.substitute(limit=2)
%sql $limit_one
Source: JupySQL documentation.
Important: If you use this approach, ensure you trust/sanitize the input!

multiple parameter "IN" prepared statement

I was trying to figure out how can I set multiple parameters for the IN clause in my SQL query using PreparedStatement.
For example in this SQL statement, I'll be having indefinite number of ?.
select * from ifs_db where img_hub = ? and country IN (multiple ?)
I've read about this in
PreparedStatement IN clause alternatives?
However I can't figure it out how to apply it to my SQL statement above.
There's not a standard way to handle this.
In SQL Server, you can use a table-valued parameter in a stored procedure and pass the countries in a table and use it in a join.
I've also seen cases where a comma-separated list is passed in and then parsed into a table by a function and then used in a join.
If your countries are standard ISO codes in a delimited list like '#US#UK#DE#NL#', you can use a rather simplistic construct like:
select * from ifs_db where img_hub = ? and ? LIKE '%#' + country + '#%'
Sormula will work for any data type (even custom types). This example uses int's for simplicity.
ArrayList<Integer> partNumbers = new ArrayList<Integer>();
partNumbers.add(999);
partNumbers.add(777);
partNumbers.add(1234);
// set up
Database database = new Database(getConnection());
Table<Inventory> inventoryTable = database.getTable(Inventory.class);
ArrayListSelectOperation<Inventory> operation =
new ArrayListSelectOperation<Inventory>(inventoryTable, "partNumberIn");
// show results
for (Inventory inventory: operation.selectAll(partNumbers))
System.out.println(inventory.getPartNumber());
You could use setArray method as mentioned in the javadoc below:
http://docs.oracle.com/javase/6/docs/api/java/sql/PreparedStatement.html#setArray(int, java.sql.Array)
Code:
PreparedStatement statement = connection.prepareStatement("Select * from test where field in (?)");
Array array = statement.getConnection().createArrayOf("VARCHAR", new Object[]{"AA1", "BB2","CC3"});
statement.setArray(1, array);
ResultSet rs = statement.executeQuery();

Django select only rows with duplicate field values

suppose we have a model in django defined as follows:
class Literal:
name = models.CharField(...)
...
Name field is not unique, and thus can have duplicate values. I need to accomplish the following task:
Select all rows from the model that have at least one duplicate value of the name field.
I know how to do it using plain SQL (may be not the best solution):
select * from literal where name IN (
select name from literal group by name having count((name)) > 1
);
So, is it possible to select this using django ORM? Or better SQL solution?
Try:
from django.db.models import Count
Literal.objects.values('name')
.annotate(Count('id'))
.order_by()
.filter(id__count__gt=1)
This is as close as you can get with Django. The problem is that this will return a ValuesQuerySet with only name and count. However, you can then use this to construct a regular QuerySet by feeding it back into another query:
dupes = Literal.objects.values('name')
.annotate(Count('id'))
.order_by()
.filter(id__count__gt=1)
Literal.objects.filter(name__in=[item['name'] for item in dupes])
This was rejected as an edit. So here it is as a better answer
dups = (
Literal.objects.values('name')
.annotate(count=Count('id'))
.values('name')
.order_by()
.filter(count__gt=1)
)
This will return a ValuesQuerySet with all of the duplicate names. However, you can then use this to construct a regular QuerySet by feeding it back into another query. The django ORM is smart enough to combine these into a single query:
Literal.objects.filter(name__in=dups)
The extra call to .values('name') after the annotate call looks a little strange. Without this, the subquery fails. The extra values tricks the ORM into only selecting the name column for the subquery.
try using aggregation
Literal.objects.values('name').annotate(name_count=Count('name')).exclude(name_count=1)
In case you use PostgreSQL, you can do something like this:
from django.contrib.postgres.aggregates import ArrayAgg
from django.db.models import Func, Value
duplicate_ids = (Literal.objects.values('name')
.annotate(ids=ArrayAgg('id'))
.annotate(c=Func('ids', Value(1), function='array_length'))
.filter(c__gt=1)
.annotate(ids=Func('ids', function='unnest'))
.values_list('ids', flat=True))
It results in this rather simple SQL query:
SELECT unnest(ARRAY_AGG("app_literal"."id")) AS "ids"
FROM "app_literal"
GROUP BY "app_literal"."name"
HAVING array_length(ARRAY_AGG("app_literal"."id"), 1) > 1
Ok, so for some reason none of the above worked for, it always returned <MultilingualQuerySet []>. I use the following, much easier to understand but not so elegant solution:
dupes = []
uniques = []
dupes_query = MyModel.objects.values_list('field', flat=True)
for dupe in set(dupes_query):
if not dupe in uniques:
uniques.append(dupe)
else:
dupes.append(dupe)
print(set(dupes))
If you want to result only names list but not objects, you can use the following query
repeated_names = Literal.objects.values('name').annotate(Count('id')).order_by().filter(id__count__gt=1).values_list('name', flat='true')