I am currently working on a project and I want to know how to save an SQLite database in Rails as a CSV file. When the user clicks a button, I want the current database on the system to be downloaded. Can anybody help me? Thanks!
Your problem isn't really specific to Rails. Instead, you're mostly dealing with an administrative issue. You should write a script to export your database as CSV, something like this:
#!/bin/bash
./bin/sqlite3 ./my_app/db/my_database.db <<!
.headers on
.mode csv
.output my_output_file.csv
select * from my_table;
!
This script exports a single table. If you have additional tables, you'll want to add them to your script.
The only Rails related issue is the matter of calling that script. Save the script within your application structure; I'd suggest my_app/assets or some similar location.
Now you can run that script using system(command) where command is the absolute path for your script, within a set of double-quotes.
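If you would rather not depend on the sqlite3 binary living at a fixed path, the same export can be sketched with Python's standard library. This is only a rough illustration: the function name export_tables_to_csv and the one-file-per-table layout are my own assumptions, not something from the script above.

```python
import csv
import sqlite3
from pathlib import Path


def export_tables_to_csv(db_path, out_dir):
    """Write every user table in the SQLite file at db_path to out_dir/<table>.csv."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master lists the schema; skip SQLite's internal tables.
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name")]
        for table in tables:
            # Table names cannot be bound as parameters, so quote the identifier.
            quoted = '"' + table.replace('"', '""') + '"'
            cursor = conn.execute(f"SELECT * FROM {quoted}")
            target = out_dir / f"{table}.csv"
            with open(target, "w", newline="", encoding="utf-8") as handle:
                writer = csv.writer(handle)
                # Header row, like ".headers on" in the sqlite3 shell.
                writer.writerow([col[0] for col in cursor.description])
                writer.writerows(cursor)
            written.append(target)
    finally:
        conn.close()
    return written
```

Calling export_tables_to_csv("my_app/db/my_database.db", "exports") would cover the "additional tables" case without listing each table by hand.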
I have downloaded a CSV file and am trying to use it for a SQL project (I am using Jupyter notebooks). Do I even need the CSV file or is there a way to use it without downloading it? I'm very new to all of this!
This is the link to the data that I downloaded:
https://github.com/new-york-civil-liberties-union/NYPD-Misconduct-Complaint-Database
What's your goal? Are you trying to learn SQL, or do you just want to work with the data?
If all you want is to load that csv into a table in a SQLite database, it would be easiest to do using the sqlite command line shell.
I'm on Windows, so forgive me if you aren't...
Open the Command Prompt
Navigate to the folder in which you want the new sqlite db file (e.g. cd C:\Users\User\Data)
sqlite3 NewDBName.db (e.g. sqlite3 MyNewDb.db)
.mode csv
.import path/to/downloaded/csv.csv NewTableName (e.g. .import C:\Users\User\Downloads\CCRB_database_raw.csv CCRB)
That should be it. You can check that it worked by running .schema; you should see the structure of your new table.
Now you can try out some sql statements:
SELECT * FROM CCRB LIMIT 10;
Here are some more detailed instructions.
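Since you mentioned Jupyter notebooks, here is a minimal sketch of the same import done from Python instead of the command line shell. The function name import_csv is hypothetical, and I am assuming the first CSV row holds the column names and that every row has the same number of fields. All values are stored as text.

```python
import csv
import sqlite3


def import_csv(db_path, csv_path, table):
    """Load csv_path into a new table; the first CSV row supplies the column names."""

    def quote(name):
        # Identifiers cannot be bound as parameters, so quote them by hand.
        return '"' + name.replace('"', '""') + '"'

    # utf-8-sig also reads plain UTF-8 and drops a leading byte-order mark.
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        columns = ", ".join(quote(name) for name in header)
        placeholders = ", ".join("?" for _ in header)
        conn = sqlite3.connect(db_path)
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(f"CREATE TABLE {quote(table)} ({columns})")
                conn.executemany(
                    f"INSERT INTO {quote(table)} VALUES ({placeholders})", reader)
            count = conn.execute(
                f"SELECT COUNT(*) FROM {quote(table)}").fetchone()[0]
        finally:
            conn.close()
    return count
```

In a notebook cell you could then run import_csv("MyNewDb.db", "CCRB_database_raw.csv", "CCRB") and query the table with sqlite3 as usual.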
Note: I believe I may be missing a simple solution to this problem. I'm relatively new to programming. Any advice is appreciated.
The problem: A small team of people (~3-5) want to be able to automate, as far as possible, the filing of downloaded files in appropriate folders. Files will be downloaded into a shared downloads folder. The files in this downloads folder will be sorted into a large shared folder structure according to their file-type, URL the file was downloaded from, and so on and so forth. These files are stored on a shared server, and the actual sorting will be done by some kind of shell script running on the server itself.
Whilst there are some utilities which do this (such as Maid), they don't do everything I want them to do. Maid, for example, doesn't have a way to get the download URL of a file on Linux. Additionally, it is written in Ruby, which I'd like to avoid.
The biggest stumbling block, then, is finding a way to get the URL of the downloaded file so that it can be passed into the shell script. Originally I thought this could be done via getfattr, which reads a file's extended attributes. Frustratingly, however, whilst Chromium saves a file's download URL as an extended attribute, Firefox doesn't seem to do the same thing. So relying on extended attributes seems to be out of the question.
What Firefox does do, however, is store download 'metadata' in the places.sqlite file, in two separate tables: moz_annos and moz_places. Inspired by this, I decided to build a Firefox extension that writes all information about the downloaded file to a SQLite database, downloads.sqlite, on our server upon completion of the download. This includes the URL, MIME type, etc. of the downloaded file.
The idea is that with this data, the server could run a shell script that does some fine-grained sorting of the downloaded file into our shared file system.
However, I am struggling to find out a stable, reliable, and portable way of 'triggering' the script that will actually move the files, as well as passing information about these files to the script so that it can sort them accordingly.
There are a few ways I thought I could go about this. I'm not sure which method is the most appropriate:
1) Watch Downloads Folder
This method would watch for changes to the shared downloads directory, then use the file name of the downloaded file to query downloads.sqlite, getting the matching row, then finally passing the file's attributes into a bash script that sorts said file.
Difficulties: Finding a way to reliably match the downloaded file with the appropriate record in the database. Files may have the same download name but need to be sorted differently, perhaps, for example, if they were downloaded from a different URL. Additionally, I'd like to get additional attributes like whether the file was downloaded in incognito mode.
2) Create Auxiliary 'Helper' File
Upon a file download event, the extension creates a 'helper' text file, named after the downloaded file plus some marker, that contains the additional file attributes:
/Downloads/
mydownload.pdf
mydownload-downloadhelper.txt
The server can then watch for the creation of a .txt file in the downloads directory and run the necessary shell script from this.
Difficulties: Whilst this avoids using a SQLite database, it seems rather ungraceful and hacky, and I can see a multitude of ways in which this method would just break or not work.
3) Watch SQLite Database
This method writes to the shared SQLite database downloads.sqlite on the server. Then, by some method, the server watches for a new row being inserted into this database. This could be done either by watching the SQLite database for a new INSERT on a table, or by having a SQLite trigger on INSERT that runs a bash script, passing the download information into a shell script.
Difficulties: there doesn't seem to be any easy way to watch a SQLite database for a new row insert, and a trigger within SQLite doesn't seem to be able to launch an external script/program. I've searched high and low for a method of doing either of these two things, but I'm struggling to find any documented way to do it that I am able to understand.
What I would like is :
Some feedback on which of these methods is appropriate, or if there is a more appropriate method that I am overlooking.
An example of a system/program that does something similar to this.
Many thanks in advance.
It seems to me that you have put the cart before the horse:
Use cron to periodically check for new downloads. Process them on the command line instead of trying to trigger things from inside sqlite3:
a) Here is an approach using your shared sqlite3 database "downloads.sqlite":
Upfront once:
1) Add a table to your database containing just an integer record counter and a timestamp field, e.g., "table_counter":
sqlite3 downloads.sqlite "CREATE TABLE table_counter (counter INTEGER PRIMARY KEY NOT NULL, timestamp DATETIME DEFAULT (datetime('now')));" 2>/dev/null
2) Insert an initial record into this new table, setting the "counter" to zero and recording a timestamp:
sqlite3 downloads.sqlite "INSERT INTO table_counter VALUES (0, datetime('now'));" 2>/dev/null
Every so often:
3) Query the table containing the downloads with a "SELECT COUNT(*)" statement:
sqlite3 downloads.sqlite "SELECT COUNT(*) from table_downloads;" 2>/dev/null
Result, e.g., 20
4) Compare this number to the number stored in the record counter field:
sqlite3 downloads.sqlite "SELECT (counter) from table_counter;" 2>/dev/null
Result, e.g., 17
5) If the result from 3) is greater than the result from 4), then you have downloaded more files than you have processed. If so, query the table containing the downloads with a "SELECT" statement for the oldest not yet processed download, using a subselect:
sqlite3 downloads.sqlite "SELECT * from table_downloads where rowid = (SELECT (counter+1) from table_counter);" 2>/dev/null
In my example this would SELECT all values for the data record with the rowid of 17+1 = 18.
6) Do your magic with the downloaded file stored as record #18.
7) Increase the record counter in "table_counter", again using a subselect:
sqlite3 downloads.sqlite "UPDATE table_counter SET counter = (SELECT (counter) from table_counter)+1;" 2>/dev/null
8) Finally, update the timestamp in "table_counter":
Why? Shit happens on shared drives... This way you can always check how many download records have been processed and when this last happened.
sqlite3 downloads.sqlite "UPDATE table_counter SET timestamp = datetime('now');" 2>/dev/null
If you want to keep a log of this processing, then change the SQL statement in 4) to a "SELECT COUNT(*)" and the one in 7) to an "INSERT counter", with its subselect changed to "(SELECT (counter+1) from table_counter)" ...
Please note: The redirections " 2>/dev/null" at the end of the SQL statements are just to suppress this kind of line issued by newer versions of SQLite3 before showing your query results.
-- Loading resources from /usr/home/bernie/.sqliterc
If you don't like timeStamps based on UTC then use localtime instead:
(datetime('now','localtime'))
Put steps 3) through 8) in a shell script and use a cron entry to run this query/comparison periodically.
Use the complete /path/to/sqlite3 in this shell script, just in case: on a shared drive, someone could be fooling around with paths and surprise your cron job.
b) I will give you a simpler answer using awk and some hash like md5 in a separate answer.
So it is easier for future readers and easier for you to "rate" :-)
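To make approach a) concrete, here is a minimal sketch of steps 3) through 8) as a single function that a cron job could call. It is written in Python rather than shell purely for illustration. The function name process_new_downloads and the handle_row callback are hypothetical, and like the steps above it assumes the rowids in table_downloads run 1, 2, 3, ... with no deleted rows.

```python
import sqlite3


def process_new_downloads(db_path, handle_row):
    """Process every not-yet-handled row of table_downloads, oldest first.

    handle_row is a caller-supplied function (hypothetical) that does the
    actual sorting of one downloaded file. Returns the number of rows handled.
    """
    processed = 0
    conn = sqlite3.connect(db_path)
    try:
        while True:
            # Step 3: how many downloads have been recorded?
            total = conn.execute(
                "SELECT COUNT(*) FROM table_downloads").fetchone()[0]
            # Step 4: how many have been processed so far?
            counter = conn.execute(
                "SELECT counter FROM table_counter").fetchone()[0]
            # Step 5: stop once nothing is pending.
            if total <= counter:
                break
            row = conn.execute(
                "SELECT * FROM table_downloads WHERE rowid = ?",
                (counter + 1,)).fetchone()
            # Step 6: do the actual work for this record.
            if row is not None:
                handle_row(row)
            # Steps 7 and 8: bump the counter and refresh the timestamp.
            with conn:
                conn.execute(
                    "UPDATE table_counter "
                    "SET counter = counter + 1, timestamp = datetime('now')")
            processed += 1
    finally:
        conn.close()
    return processed
```

The counter is only advanced after handle_row returns, so a crash in the middle of processing leaves that record to be retried on the next cron run.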
This is probably a simple question, but I could use some help. I am trying to build a small database for an application that will only be run on my computer so I want to create a local database.
To do this I am trying to use sqlite. I can use the command prompt to make what seems to be a database by using the sqlite3 databaseName; functionality, but I do not know where it is being stored.
I need to be able to find the database to access it through the application I am experimenting with. I already know all of the basic sql and such for creating the database tables and data, but I cannot figure out how to simply make the database connection.
Is there a way to specify where the database .db file will be stored, and why can I not find the file it seems to be making?
Are you using the sqlite3 shell? Here is some help from sqlite3 -help:
Usage: sqlite3 [OPTIONS] FILENAME [SQL]
If FILENAME is not supplied, the shell uses a temporary database.
If you start the shell without supplying a filename, you can save the temporary database at any time using:
sqlite> .backup MAIN "folder\your_file.extension"
Or you can ATTACH an existing database and use SQL methods:
sqlite> ATTACH DATABASE "path\stored.db" AS other;
sqlite> INSERT OR REPLACE INTO other.table1 SELECT * FROM this_table1;
sqlite> DETACH other;
For this kind of thing you can use SQLite Manager, which is available as a Firefox add-on. It's excellent for creating and managing SQLite databases.
https://addons.mozilla.org/en-US/firefox/addon/sqlite-manager/
Thanks everyone for answering, but it turns out my issue was much simpler than I thought.
I was trying to name the database after already starting the shell.
I was supposed to create the database from command line by doing sqlite3 name.db
But I was trying to use that command within the sqlite shell so nothing was being created.
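For the application side of the question, here is a small sketch of how the connection path decides where the file ends up. It uses Python's sqlite3 module as an example. The function names and the "example" table are made up for illustration.

```python
import sqlite3
from pathlib import Path


def create_database(db_path):
    """Create (or open) the SQLite file at db_path and return its absolute path."""
    path = Path(db_path).expanduser().resolve()
    path.parent.mkdir(parents=True, exist_ok=True)
    # The file is created at exactly this path; a relative path would be
    # resolved against the current working directory instead.
    conn = sqlite3.connect(str(path))
    try:
        with conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS example "
                "(id INTEGER PRIMARY KEY, note TEXT)")
    finally:
        conn.close()
    return path


def save_in_memory_database(conn, db_path):
    """Copy an open (e.g. in-memory) database to a file, much like .backup."""
    target = sqlite3.connect(str(db_path))
    try:
        conn.backup(target)
    finally:
        target.close()
```

Printing the value returned by create_database("name.db") shows where the file actually landed, which is usually the quickest way to answer "where is it being stored?".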
I have created a table "sample" in Hive and loaded a CSV file "sample.txt" into it.
Now I need to get that data from "sample" into the local file /opt/zxy/sample.txt.
How can I do that?
Hortonworks' Sandbox lets you do it through its HCatalog menu. Otherwise, the syntax is
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/c' SELECT a.* FROM b
as per the Hive language manual.
Since your intention is just to copy the entire file from HDFS to your local FS, I would not suggest doing it through a Hive query, for the following reasons:
It'll start a MapReduce job, which will take more time than a normal copy.
It'll create file(s) with different names (000000_0, 000001_0 and so on), which you will have to rename manually afterwards.
You might have trouble opening these files because they have no extension. Your OS will be unable to choose an application to open them on its own, so you either have to rename each file or manually select an application to open it.
To avoid these problems you could use the HDFS get command:
bin/hadoop fs -get /user/hive/warehouse/sample/sample.txt /opt/zxy/sample.txt
Simple and easy. But if you need to copy only some selected data, then you have to use a Hive query.
HTH
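If you do end up going the INSERT OVERWRITE LOCAL DIRECTORY route, the renaming problem above can be worked around by merging the part files yourself. Here is a rough sketch. The function name merge_hive_output is hypothetical, and it assumes the export used Hive's default text format, where fields are separated by Ctrl-A and rows by newlines.

```python
import csv
from pathlib import Path

# Ctrl-A is Hive's default field separator for text output.
HIVE_DEFAULT_DELIMITER = "\x01"


def merge_hive_output(export_dir, target_file, delimiter=HIVE_DEFAULT_DELIMITER):
    """Combine Hive part files (000000_0, 000001_0, ...) into one CSV file.

    target_file is assumed to live outside export_dir.
    Returns the number of rows written.
    """
    # Skip hidden checksum files (.000000_0.crc) and markers such as _SUCCESS.
    parts = sorted(
        p for p in Path(export_dir).iterdir()
        if p.is_file() and not p.name.startswith((".", "_")))
    rows = 0
    with open(target_file, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for part in parts:
            with open(part, encoding="utf-8") as handle:
                for line in handle:
                    writer.writerow(line.rstrip("\n").split(delimiter))
                    rows += 1
    return rows
```

For example, merge_hive_output("/tmp/c", "/opt/zxy/sample.txt") would turn the directory from the query above into a single comma-separated file.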
I usually run my query directly through Hive on the command line for this kind of thing, and pipe it into the local file like so:
hive -e 'select * from sample' > /opt/zxy/sample.txt
Hope that helps.
Readers who are accessing Hive from Windows can check out this script on GitHub.
It's a Python + paramiko script that extracts Hive data to the local Windows file system.
Suppose I have written a script Table_ABC.sql which creates table ABC. I have created many such scripts, one for each of the required tables. Now I want to write a script that calls all of these script files in sequence, so basically I want another script file, createTables.sql. MySQL provides an option to execute a script file from the "mysql" shell application, but I could not find a command like exec c:/myscripts/mytable.sql. Please tell me if there is any command that can be written in the SQL script itself to call another one in the latest MySQL versions, or an alternative for the same.
Thanks
You can use the source command. So your script will be something like:
use your_db;
source script/s1.sql;
source script/s2.sql;
-- so on, so forth
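If there are many table scripts, the master file itself can be generated. Below is a small sketch of that idea. The function name write_master_script is hypothetical, and alphabetical order is only an assumption, so tables that depend on each other may need a hand-picked order instead. Keep in mind that source is a command of the mysql client, so the generated file has to be run from the mysql shell.

```python
from pathlib import Path


def write_master_script(script_dir, master_name="createTables.sql", database=None):
    """Write a master script that sources every other .sql file in script_dir."""
    script_dir = Path(script_dir)
    # Alphabetical order is an assumption; leave the master file itself out.
    scripts = sorted(
        p for p in script_dir.glob("*.sql") if p.name != master_name)
    lines = []
    if database:
        lines.append(f"use {database};")
    # Forward slashes keep the paths usable on Windows as well.
    lines.extend(f"source {p.as_posix()};" for p in scripts)
    master = script_dir / master_name
    master.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return master
```

Running write_master_script("script", database="your_db") would produce a file much like the one shown above.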