I created a backup cmd file with this code
EXPDP system/system EXCLUDE=statistics DIRECTORY=bkp_dir DUMPFILE=FULLDB.DMP LOGFILE=FULLDB.log FULL=Y
It works well, but when I run the backup again it finds that the file already exists
and terminates the process. It will not run unless I delete or rename the previous file. I want to add something to the dump file and log file names that distinguishes them from day to day, such as the system date or a copy number.
The option REUSE_DUMPFILES specifies whether to overwrite a preexisting dump file. Normally, Data Pump Export will return an error if you specify a dump file name that already exists. The REUSE_DUMPFILES parameter allows you to override that behavior and reuse a dump file name.
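For example, adding it to the command from the question (a sketch; this simply overwrites the existing FULLDB.DMP on every run, so you only ever keep the latest backup):
EXPDP system/system EXCLUDE=statistics DIRECTORY=bkp_dir DUMPFILE=FULLDB.DMP LOGFILE=FULLDB.log FULL=Y REUSE_DUMPFILES=YES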
If you wish to use a separate file name for each day, you can build the name with the date command in a Unix/Linux environment:
DUMPFILE=FULLDB_$(date '+%Y-%m-%d').DMP
Similar techniques are available on Windows, which you may explore if you're running expdp in a Windows environment.
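For instance, in the Windows cmd file from the question, something along these lines could work. Treat it as a sketch: the %DATE% substring offsets depend on your regional date format, so adjust them to your locale (or derive the date some other way, e.g. from wmic os get LocalDateTime):
:: build YYYY-MM-DD from %DATE% (the offsets below assume a "Ddd MM/DD/YYYY" format)
set TODAY=%DATE:~10,4%-%DATE:~4,2%-%DATE:~7,2%
EXPDP system/system EXCLUDE=statistics DIRECTORY=bkp_dir DUMPFILE=FULLDB_%TODAY%.DMP LOGFILE=FULLDB_%TODAY%.log FULL=Y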
I want to read a tab-delimited file using PL/SQL and insert the file data into a table.
A new file will be generated every day.
I am not sure if an external table will help here, because the filename changes based on the date.
Filename: SPRReadResponse_YYYYMMDD.txt
Below is the sample file data.
An option that works on your own PC is SQL*Loader. As the file name changes every day, you'd use an operating system batch script (on MS Windows, a .BAT file) to pass a different name when calling sqlldr (along with the control file).
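A minimal sketch of what that might look like, assuming a control file load.ctl along these lines (the table and column names are made up, and scott/tiger is just a placeholder account):
LOAD DATA
INFILE 'SPRReadResponse.txt'
APPEND INTO TABLE spr_read_response
FIELDS TERMINATED BY X'09' TRAILING NULLCOLS
(col1, col2, col3)
The batch script would then build today's file name and pass it via the DATA parameter, which overrides the INFILE in the control file:
sqlldr scott/tiger control=load.ctl data=SPRReadResponse_20230101.txt log=load.log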
An external table requires access to the database server and (at least) read privilege on the directory that contains those .TXT files. Unless you're a DBA, you'll have to talk to one to set up that environment. As for the changing file name, you could use alter table ... location, which is rather inconvenient.
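Inconvenient meaning you'd have to run something like this every day before querying (the external table name here is made up):
ALTER TABLE spr_read_response_ext LOCATION ('SPRReadResponse_20230101.txt');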
If you want to have more control over it, use UTL_FILE; yes, you still need access to that directory on the database server, but, writing a PL/SQL script, you can modify whatever you want, including the file name.
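A minimal sketch of that approach, assuming a directory object (here called MY_DIR, a placeholder) that points to the folder with the files:
DECLARE
  l_file UTL_FILE.FILE_TYPE;
  l_line VARCHAR2(4000);
  l_name VARCHAR2(100) := 'SPRReadResponse_' || TO_CHAR(SYSDATE, 'YYYYMMDD') || '.txt';
BEGIN
  l_file := UTL_FILE.FOPEN('MY_DIR', l_name, 'R');
  LOOP
    BEGIN
      UTL_FILE.GET_LINE(l_file, l_line);
    EXCEPTION
      WHEN NO_DATA_FOUND THEN EXIT;  -- end of file
    END;
    -- split l_line on CHR(9) (tab) and INSERT into your target table here
    NULL;
  END LOOP;
  UTL_FILE.FCLOSE(l_file);
END;
/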
Or, as a simpler option, first rename the input file to SPRReadResponse.txt, then load it and save yourself all that trouble.
The text file contains data like:
1,'name',34
2,'name1',23
If you have Access Client Solutions, you can use the File Transfer function to upload the file.
You can also upload directly from Excel if the file is open:
https://www.ibm.com/support/pages/transferring-data-excel-using-access-client-solutions
When the Db2-server runs on Linux/Unix/Windows, you can CALL a stored procedure to do import or load.
BUT, the file to be imported or loaded must already be on the Db2-server, or on a file-system that the Db2-server process can read. So any filenames are relative to the Db2-server (not to your workstation, unless of course the Db2-server runs on your workstation).
If the target table already exists, the connected-userid needs suitable permissions on that table. If the target table does not exist, you need to create it first.
Also the userid needs execute permission on the stored procedure that does the work.
So there are three steps:
Copy the file to be imported (or loaded) to a location that the Db2-server can read.
Call the ADMIN_CMD stored procedure with parameters telling it what to do, in this case to import a file.
Examine the result set of the stored procedure to see what happened. If the import or load failed, you need to run the SQL listed in the MSG_RETRIEVAL column of the result set to see why it failed (assuming you used the MESSAGES ON SERVER option for the import).
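A minimal sketch of the second step, assuming a comma-delimited file already copied to /tmp on the Db2-server and an existing target table (the file name, schema, and table name below are placeholders):
CALL SYSPROC.ADMIN_CMD(
  'IMPORT FROM /tmp/sample.txt OF DEL MESSAGES ON SERVER INSERT INTO MYSCHEMA.MYTABLE'
);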
See the Db2 documentation online here for import or load
There are also many examples here on stackoverflow.
So do your research and learn.
On Db2 11.5 you can use a REMOTE TABLE to import a text file into Db2.
Use the REMOTESOURCE YES option if the file is on your client and not in a directory visible to the database server.
https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.5.0/com.ibm.db2.luw.sql.ref.doc/doc/r_create_ext_table.html?pos=2
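A rough sketch of what that can look like, using the transient external-table form (the column definitions, table name, and path are made up; check the linked documentation above for the exact option spelling):
INSERT INTO myschema.mytable
  SELECT * FROM EXTERNAL '/home/me/sample.txt'
    (c1 INT, c2 VARCHAR(20), c3 INT)
    USING (DELIMITER ',' REMOTESOURCE YES);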
Note: I believe I may be missing a simple solution to this problem. I'm relatively new to programming. Any advice is appreciated.
The problem: A small team of people (~3-5) want to automate, as far as possible, the filing of downloaded files in appropriate folders. Files will be downloaded into a shared downloads folder. The files in this downloads folder will be sorted into a large shared folder structure according to their file type, the URL the file was downloaded from, and so on. These files are stored on a shared server, and the actual sorting will be done by some kind of shell script running on the server itself.
Whilst there are some utilities which do this (such as Maid), they don't do everything I want them to do. Maid, for example, doesn't have a way to get the download URL of a file in Linux. Additionally, it is written in Ruby, which I'd like to avoid.
The biggest stumbling block, then, is finding a way to get the URL of the downloaded file that can be passed into the shell script. Originally I thought this could be done via getfattr, which would get a file's extended attributes. Frustratingly, however, whilst Chromium saves a file's download URL as an extended attribute, Firefox doesn't seem to do the same thing. So relying on extended attributes seems to be out of the question.
What Firefox does do, however, is store download 'metadata' in the places.sqlite file, in two separate tables - moz_annos and moz_places. Inspired by this, I decided to build a Firefox extension that writes all information about the downloaded file to a SQLite database downloads.sqlite on our server upon the completion of said download. This includes the URL, MIME type, etc. of the downloaded file.
The idea is that with this data, the server could run a shell script that does some fine-grained sorting of the downloaded file into our shared file system.
However, I am struggling to find out a stable, reliable, and portable way of 'triggering' the script that will actually move the files, as well as passing information about these files to the script so that it can sort them accordingly.
There are a few ways I thought I could go about this. I'm not sure which method is the most appropriate:
1) Watch Downloads Folder
This method would watch for changes to the shared downloads directory, then use the file name of the downloaded file to query downloads.sqlite for the matching row, and finally pass the file's attributes into a bash script that sorts said file.
Difficulties: Finding a way to reliably match the downloaded file with the appropriate record in the database. Files may have the same download name but need to be sorted differently, perhaps, for example, if they were downloaded from a different URL. Additionally, I'd like to get additional attributes like whether the file was downloaded in incognito mode.
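As a rough sketch of the watching part, I was imagining something like this (using inotifywait from inotify-tools; the paths, table, and column names are placeholders):
#!/bin/bash
# Watch the shared downloads folder and look up each finished file in downloads.sqlite
inotifywait -m -e close_write -e moved_to --format '%f' /srv/share/Downloads |
while read -r fname; do
    url=$(sqlite3 /srv/share/downloads.sqlite \
        "SELECT url FROM downloads WHERE filename = '$fname' ORDER BY rowid DESC LIMIT 1;")
    /srv/share/scripts/sort-download.sh "/srv/share/Downloads/$fname" "$url"
done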
2) Create Auxiliary 'Helper' File
Upon a file download event, the extension creates a 'helper' text file, named after the downloaded file plus some marker, that contains the additional file attributes:
/Downloads/
mydownload.pdf
mydownload-downloadhelper.txt
The server can then watch for the creation of a .txt file in the downloads directory and run the necessary shell script from this.
Difficulties: Whilst this avoids using a SQLite database, it seems rather ungraceful and hacky, and I can see a multitude of ways in which this method would just break or not work.
3) Watch SQLite Database
This method writes to the shared SQLite database downloads.sqlite on the server, then, by some method, watches for a new row being inserted into this database. This could either be done by watching the SQLite database for a new INSERT on a table, or by having an SQLite trigger on INSERT that runs a bash script, passing the download information on to a shell script.
Difficulties: there doesn't seem to be any easy way to watch an SQLite database for a new row insert, and a trigger within SQLite doesn't seem to be able to launch an external script/program. I've searched high and low for a method of doing either of these two things, but I'm struggling to find any documented way to do it that I am able to understand.
What I would like is:
Some feedback on which of these methods is appropriate, or if there is a more appropriate method that I am overlooking.
An example of a system/program that does something similar to this.
Many thanks in advance.
It seems to me that you have put the cart before the horse:
Use cron to periodically check for new downloads. Process them on the command line instead of trying to trigger things from inside sqlite3:
a) Here is an approach using your shared sqlite3 database "downloads.sqlite":
Upfront once:
1) Add a table to your database containing just an integer as record counter and a timeStamp field, e.g., "table_counter":
sqlite3 downloads.sqlite "CREATE TABLE "table_counter" ( "counter" INTEGER PRIMARY KEY NOT NULL, "timestamp" DATETIME DEFAULT (datetime('now','UTC')));" 2>/dev/null
2) Insert an initial record into this new table setting the "counter" to zero and recording a timeStamp:
sqlite3 downloads.sqlite "INSERT INTO "table_counter" VALUES (0, (SELECT datetime('now','UTC')));" 2>/dev/null
Every so often:
3) Query the table containing the downloads with a "SELECT COUNT(*)" statement:
sqlite3 downloads.sqlite "SELECT COUNT(*) from table_downloads;" 2>/dev/null
Result e.g., 20
4) Compare this number to the number stored in the record counter field:
sqlite3 downloads.sqlite "SELECT (counter) from table_counter;" 2>/dev/null
Result e.g., 17
5) If result from 3) > result from 4), then you have downloaded more files than processed. If so, query the table containing the downloads with a "SELECT" statement for the oldest not yet processed download, using a "subselect":
sqlite3 downloads.sqlite "SELECT * from table_downloads where rowid = (SELECT (counter+1) from table_counter);" 2>/dev/null
In my example this would SELECT all values for the data record with the rowid of 17+1 = 18;
6) Do your magic in regards to the downloaded file stored as record #18.
7) Increase the record counter in the "table_counter", again using a subselect:
sqlite3 downloads.sqlite "UPDATE table_counter SET counter = (SELECT (counter) from table_counter)+1;" 2>/dev/null
8) Finally, update the timeStamp for the "table_counter":
Why? Shit happens on shared drives... This way you can always check how many download records have been processed and when this has happened last time.
sqlite3 downloads.sqlite "UPDATE table_counter SET timeStamp = datetime('now','UTC');" 2>/dev/null
If you want to have a log of this processing then change the SQL statements in 4) to a "SELECT COUNT(*)" and in 7) to an "INSERT counter" and its subselect to an "(SELECT (counter+1) from table_counter)" respectively ...
Please note: The redirections " 2>/dev/null" at the end of the SQL statements are just to suppress this kind of line issued by newer versions of SQLite3 before showing your query results.
-- Loading resources from /usr/home/bernie/.sqliterc
If you don't like timeStamps based on UTC then use localtime instead:
(datetime('now','localtime'))
Put steps 3) through 8) in a shell script and use a cron entry to run this query/comparison periodically...
Use the complete /path/to/sqlite3 in this shell script (just in case: on a shared drive, someone could be fooling around with paths and surprise your cron ...).
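A sketch of such a script, under the assumption that the database lives at /srv/share/downloads.sqlite (adjust the paths, and put your actual file handling where the comment says so):
#!/bin/sh
SQLITE=/usr/bin/sqlite3                      # full path, as noted above
DB=/srv/share/downloads.sqlite

downloaded=$($SQLITE "$DB" "SELECT COUNT(*) FROM table_downloads;" 2>/dev/null)   # step 3)
processed=$($SQLITE "$DB" "SELECT counter FROM table_counter;" 2>/dev/null)       # step 4)

while [ "$processed" -lt "$downloaded" ]; do                                      # step 5)
    row=$($SQLITE "$DB" "SELECT * FROM table_downloads WHERE rowid = (SELECT counter+1 FROM table_counter);" 2>/dev/null)

    # step 6): do your magic with "$row" here (move/sort the downloaded file)

    $SQLITE "$DB" "UPDATE table_counter SET counter = counter + 1;" 2>/dev/null              # step 7)
    $SQLITE "$DB" "UPDATE table_counter SET timestamp = datetime('now','UTC');" 2>/dev/null  # step 8)
    processed=$((processed + 1))
done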
b) I will give you a simpler answer using awk and some hash like md5 in a separate answer.
So it is easier for future readers and easier for you to "rate" :-)
I am trying to read and delete specific files in SQL Server.
Scenario
When I publish my database using a post-deployment script, it automatically takes a backup of the database in the default backup directory. Now that folder is getting bigger day by day. I want to write a SQL job that will run and delete all backup files except the last two. Is this possible, and is it appropriate to do in SQL? If not, why?
You can create a SQL job to delete files from the file system.
First of all, you must have the SQL Agent service started.
The next step is to create a PowerShell script that deletes the desired files. Place the script in, for example, the D:\Test\ folder and name it something like DeleteFile.ps1. Edit the script and add the following code:
# list the .txt files in D:\Test, sorted oldest first
$files = Get-ChildItem D:\Test *.txt | Select Name, CreationTime, FullName | Sort CreationTime
# delete everything except the two newest files
for ($i=0; $i -lt ($files.Count - 2); $i++) {
    Remove-Item $files[$i].FullName
}
then save and close the editor.
Then, under the SQL Server Agent tree, right-click the Jobs folder and choose New Job. In the dialog window, on the General tab, give the job a name; then, on the left side of the window, choose Steps, and then New.
In the new dialog window, in the Type section, choose Operating system (CmdExec). In the Command field, invoke the PowerShell script you created for deleting the files (a .ps1 file isn't executed directly by CmdExec, so call it through powershell.exe):
powershell.exe -ExecutionPolicy Bypass -File D:\Test\DeleteFile.ps1
and click OK to create the job.
When done, to test the file deletion, right-click the created job and choose the Start Job at Step option.
NOTE
It is suggested to test all of this on a test machine before putting it into production. The files I delete here are .txt files, but you just need to change the code to match your .bak files. Of course, you can also change the location of the files.
Now i am getting that folder bigger day by day . Now i want to write a sql job that will execute and will delete all bakcup files except last two
The default option when you take a backup is to append the current backup to the previous one; that option is NOINIT. Please see the backup documentation. What you have to do is add the INIT option while taking the backup; this overwrites the previous backup.
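For example (a sketch; the database name and path are placeholders):
BACKUP DATABASE MyDatabase
  TO DISK = N'D:\Backups\MyDatabase.bak'
  WITH INIT;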
If your scenario is such that you just want to remove old backup files, this can be done with a custom script. Fortunately, you have Ola Hallengren's maintenance scripts for this. I strongly suggest you first read the FAQ and examples; reading the examples will make it clear how the script works.
I have created a table in Hive, "sample", and loaded a CSV file "sample.txt" into it.
Now I need that data from "sample" in my local file /opt/zxy/sample.txt.
How can I do that?
Hortonworks' Sandbox lets you do it through its HCatalog menu. Otherwise, the syntax is
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/c' SELECT a.* FROM b a
as per Hive language manual
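If you want the exported files to be delimited in a readable way, newer Hive versions (0.11+) also let you specify the row format; a sketch using the same names as the statement above:
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/c'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT a.* FROM b a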
Since your intention is just to copy the entire file from HDFS to your local FS, I would not suggest doing it through a Hive query, for the following reasons:
It'll start a MapReduce job, which will take more time than a normal copy.
It'll create file(s) with different names (000000_0, 000001_0 and so on), which will require you to rename the file(s) manually afterwards.
You might face problems opening these files, as they have no extension. Your OS would be unable to choose an application to open them on its own. In such a case you either have to rename the file or manually select an application to open it.
To avoid these problems you could use HDFS get command :
bin/hadoop fs -get /user/hive/warehouse/sample/sample.txt /opt/zxy/sample.txt
Simple and easy. But if you need to copy only some selected data, then you have to use a Hive query.
HTH
I usually run my query directly through Hive on the command line for this kind of thing, and pipe it into the local file like so:
hive -e 'select * from sample' > /opt/zxy/sample.txt
Hope that helps.
Readers who are accessing Hive from a Windows OS can check out this script on GitHub.
It's a Python+paramiko script that extracts Hive data to the local Windows file system.