I have data coming from an API in JSON format, which I then run through a few functions/transformations in Python before inserting it into an SQL database via pandas and SQLAlchemy.
Now, how do I run this automatically at the end of every day, without having to open the script and run it manually?
You can use crontab on a server (or on your Linux/Mac laptop, though of course it won't run the script while the machine is turned off).
Run crontab -e to edit the crontab file. Add something like the following to run your script every day at 11 PM:
0 23 * * * ~/myscript.py
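If the script is not executable or lacks a shebang line, it can be safer to call the interpreter explicitly and capture output to a log file. A minimal sketch, assuming python3 lives at /usr/bin/python3 and using placeholder paths for the script and log:

0 23 * * * /usr/bin/python3 /home/youruser/myscript.py >> /home/youruser/myscript.log 2>&1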
crontab guru is a useful resource for trying out different schedule expressions.
I have a SQL dump that I need to import into pgAdmin 4 using PostgreSQL. However, when I run the command, the schema gets created but none of the data comes with it. I already have the database set up in pgAdmin 4. This is my first time using PostgreSQL and pgAdmin, so I know I must be missing something.
The SQL dump file was sent to me directly; I did not use pg_dump to migrate anything. The file is in my Downloads folder and I need to load it into pgAdmin.
I need this SQL dump because I have to log into several portals locally for a large project.
On Windows, using Postgres version 14, I've tried several approaches from other Stack Overflow answers, first from the command line in both Bash and PowerShell.
This is the command a coworker told me to use, which should add the tables and data for the app; it worked fine for him.
C:\Program Files\PostgreSQL\14\bin>psql -h localhost -U postgres -d the_database -f PATH_TO_YOUR_DOWNLOADS\data_dump.sql
This command creates the schema in the pgAdmin database, but no data comes with it. (I know the data is missing because I can't use my dummy logins to get into the project.)
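For reference, here is the same command with the file path quoted (in case it contains spaces) and psql's standard ON_ERROR_STOP switch added so it aborts at the first error instead of skipping over it; the Downloads path is still a placeholder:

psql -h localhost -U postgres -d the_database -v ON_ERROR_STOP=1 -f "PATH_TO_YOUR_DOWNLOADS\data_dump.sql"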
Second, I tried the built-in restore and backup tools in pgAdmin, and both of those end in an error:
Process failed: Restoring backup on the server 'PostgreSQL 14 (localhost:5432)'
Third, I tried using the Query Tool and loading the SQL file that way, but when I hit Execute I get an error there as well.
In the Query Tool I can see the data from the linked file, but it never ends up in the database.
ERROR: syntax error at or near "2"
LINE 3285: 2 Some Test 2020-11-13 07:42:29.356827 2020-11-13 04:32:...
^
SQL state: 42601
Character: 87447
Any advice?
Do I need the SQL file formatted in a certain way?
I just need the data to be imported into the pgAdmin 4 database WITH my schema.
Is there any way to insert 50k records into a PostgreSQL database using DBeaver?
Locally it worked fine for me and took about a minute, because I also changed the memory settings of PostgreSQL and DBeaver. But in our development environment, the 50k queries did not work.
Is there a way to do this anyway, or do I need to split the queries and run, for example, 10k queries five times? Any tricks?
EDIT: by "did not work" I mean that after 2,500 seconds I got an error saying something like "too much data ranges".
If you intend to execute a giant SQL script via the interface: don't even try.
If you have a CSV file, DBeaver gives you an import tool for that.
Even better, as described in the comments, the COPY command is the tool.
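A minimal sketch of the copy approach from psql, assuming a CSV file with a header row at /tmp/data.csv and a target table called my_table with matching columns (both names are made up):

\copy my_table FROM '/tmp/data.csv' WITH (FORMAT csv, HEADER true)

Note that \copy runs client-side, so the file only needs to be readable on the machine where psql runs.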
If you have a giant SQL file, you need to use the command line, like:
psql -h host -U username -d myDataBase -a -f myInsertFile
Like in this post: Run a PostgreSQL .sql file using command line arguments
I have a PL/SQL block that needs to be executed multiple times a day.
This block updates data in Microsoft SQL Server.
Is there any way I can connect to the MS SQL database from a Linux box and schedule the query to run multiple times a day?
Write a script and then use crontab to schedule the task to run as often as you would like.
To edit: crontab -e
While in crontab it works just like vi: to edit, press i; to stop editing, press Esc; to save and quit, type :wq.
To view: crontab -l
For more details: man crontab
crontab example: 54 14 * * * myJob
This will run "myJob" at 2:54 PM, daily.
You could also use something like http://crontab-generator.org/ to help you build the schedule expression.
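A minimal sketch of such a script, assuming sqlcmd from Microsoft's mssql-tools package is installed in its default Linux location, and that the server name, credentials, and file paths here are placeholders:

#!/bin/bash
# run the SQL block stored in update_block.sql against SQL Server
/opt/mssql-tools/bin/sqlcmd -S your_server -U your_user -P 'your_password' -d your_database -i /home/youruser/update_block.sql

A crontab entry such as 0 */6 * * * /home/youruser/run_update.sh would then run it every six hours.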
I have a report that I need to run every day at 00:00 and export all the information from the table to a specific location with a specific name.
Example:
select * from my_table
where date between SYSTIMESTAMP -2 and SYSTIMESTAMP -1
and export the result to a file named date.xml.
Is this possible from Oracle SQL Developer or do I need other tools?
No Oracle version was given, so I assume 10g or 11g.
To schedule your process, you just have to create a job and give it a schedule. The job runs your script (which can be a function or a stored procedure).
Here is the documentation:
http://docs.oracle.com/cd/B28359_01/server.111/b28310/scheduse.htm#i1033533
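A minimal sketch of such a job, assuming the export logic is already wrapped in a stored procedure with the made-up name MY_EXPORT_PROC:

BEGIN
  DBMS_SCHEDULER.CREATE_JOB (
    job_name        => 'DAILY_EXPORT_JOB',
    job_type        => 'STORED_PROCEDURE',
    job_action      => 'MY_EXPORT_PROC',
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=DAILY; BYHOUR=0; BYMINUTE=0',
    enabled         => TRUE);
END;
/

This runs the procedure every day at 00:00, matching the schedule in the question.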
To write to a file you can use the spool command in SQL*Plus. Here is the documentation: http://docs.oracle.com/cd/B19306_01/server.102/b14357/ch12043.htm
It's really simple to use.
spool /path/filename
your query
spool off
Obviously, the machine from which you run the script must have write permissions on the machine where you're going to write the file (I mention this because I often forget to check).
Creating an XML file is slightly more complex and a little too long to explain here, but there is a nice post in the Oracle community that explains it with a simple, practical example: https://community.oracle.com/thread/714758?start=0&tstart=0
If you do not want to use a job in Oracle, you can write a .sql file with the connection commands, the spool command, and your query, and schedule it as a simple sqlplus command on whichever machine you intend to run it from.
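A minimal sketch of that approach; the file paths and connect string are placeholders, and note that spool writes the raw query output, so producing well-formed XML still needs the technique from the Oracle community post linked above:

-- export_report.sql
SET HEADING OFF
SET FEEDBACK OFF
SET PAGESIZE 0
SET TRIMSPOOL ON
SPOOL /exports/date.xml
-- the query from the question
SELECT * FROM my_table
WHERE date BETWEEN SYSTIMESTAMP - 2 AND SYSTIMESTAMP - 1;
SPOOL OFF
EXIT

scheduled, for example, with cron:

0 0 * * * sqlplus -s user/password@mydb @/scripts/export_report.sql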
As part of ongoing research work, I am checking whether URLs exist using the cURL command. I have been running a shell script for a couple of days that performs an update for each URL in my database. However, the script only updates around 100,000 rows per day.
I was thinking that if I wrote the values to a file first and then did the updates, the execution might be faster.
I am connecting to the database from the command line:
mysql -h servername -u username -ppassword databasename -e "Update Query"
For example, instead of connecting to the database 2 million times like this and updating 2 million rows one at a time, I am planning to connect to the database only once and apply all 2 million updates from a file.
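Concretely, the idea is that the script would write statements like the following (table and column names are made up) into updates.sql and then run them in a single session:

UPDATE my_table SET is_alive = 1 WHERE url = 'http://example.com/';
UPDATE my_table SET is_alive = 0 WHERE url = 'http://example.com/dead-page';

mysql -h servername -u username -ppassword databasename < updates.sql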
So is the second approach better than the first, or would the time difference be negligible?
Three approaches:
1. You could use LOAD DATA INFILE.
2. You could build up a .sql file with all of the updates you need and run it in a single session.
3. You could use something other than a CLI to connect to the URLs and the DB. In other words, not the "curl" and "mysql" commands, but a real programming language and its libraries for checking URLs and updating databases.
Any of those would probably be faster, though you'll likely get a bigger speed improvement by making the HTTP calls in parallel. That is much easier to do from a real programming language.
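A minimal sketch of the third approach in Python, assuming a table named urls with url and is_alive columns (a made-up schema), the requests and mysql-connector-python packages, and placeholder credentials:

import concurrent.futures
import mysql.connector
import requests

def check(url):
    # a HEAD request is usually enough to tell whether the URL exists
    try:
        resp = requests.head(url, timeout=10, allow_redirects=True)
        return url, 1 if resp.status_code < 400 else 0
    except requests.RequestException:
        return url, 0

conn = mysql.connector.connect(host="servername", user="username",
                               password="password", database="databasename")
cur = conn.cursor()
cur.execute("SELECT url FROM urls")
urls = [row[0] for row in cur.fetchall()]

# check the URLs in parallel, then push all updates over a single connection
with concurrent.futures.ThreadPoolExecutor(max_workers=20) as pool:
    results = list(pool.map(check, urls))

cur.executemany("UPDATE urls SET is_alive = %s WHERE url = %s",
                [(alive, url) for url, alive in results])
conn.commit()
conn.close()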