Scrapy - Issue with setting FILES_STORE?

So I have a custom pipeline that extends Scrapy's current FilesPipeline. However, I'm having trouble with setting the FILES_STORE variable. My current file structure is:
my_scraper.py
files/
#this is where I want the files to download to
So I set FILES_STORE=/files/ and run the spider, but when I do that I get the following error:
PermissionError: [Errno 13] Permission denied: '/files/'
Why does this happen? Is there anything that I am doing wrong?

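If it's useful to anyone else, it was a simple error: FILES_STORE needed the full path to the folder. The leading slash makes /files/ an absolute path at the root of the filesystem, not the files/ folder next to my_scraper.py, which is why permission was denied.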
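A minimal sketch of building that full path, assuming the setting is defined in a Python file that sits next to the files/ folder (my_scraper.py in the layout above). The helper name files_store_for is hypothetical and not part of Scrapy.

```python
from pathlib import Path


def files_store_for(script_path: str, folder: str = "files") -> str:
    """Return the absolute path of `folder` located next to `script_path`.

    Hypothetical helper for illustration; not part of Scrapy.
    """
    return str(Path(script_path).resolve().parent / folder)


# Assumed usage inside the project, where __file__ is my_scraper.py:
# FILES_STORE = files_store_for(__file__)
```

Anchoring the path to the script's own location means the setting no longer depends on the directory the spider is started from.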

Related

MWAA load custom file info to DAG

I'm trying to use a file in a DAG.
The code I want to use is basically this:
conf_device_info = OmegaConf.load(f"./config/{dag_name}/config_dtype.json")
and my bucket is currently like this:
my-bucket
-- /dags
---- /config
------ /{dag_name}
-------- config_dtype.json
---- dag_with_the_code.py
---- /utils
------ s3_manager.py
When I import s3_manager with "import utils.s3_manager", it works fine.
When I try to run the OmegaConf code, it says
FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/airflow/config/{dag_name}/config_dtype.json'
What should I do to achieve this?
Why does the import work while referencing the file by relative path does not?
Thanks in advance.
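One possible approach, sketched under the assumption that config/ sits next to dag_with_the_code.py inside the dags folder. The error message shows that ./config was resolved against the working directory (/usr/local/airflow), not against the DAG file's directory. The import most likely works because imports are resolved through sys.path, which Airflow extends with the dags folder. The helper name config_path is hypothetical.

```python
from pathlib import Path


def config_path(dag_file: str, dag_name: str) -> Path:
    """Build the config path relative to the DAG file, not the working directory.

    Hypothetical helper; assumes config/ lives in the same folder as the DAG file.
    """
    dag_dir = Path(dag_file).resolve().parent
    return dag_dir / "config" / dag_name / "config_dtype.json"


# Assumed usage inside dag_with_the_code.py:
# conf_device_info = OmegaConf.load(config_path(__file__, dag_name))
```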

AWS Lambda package-deployed functions require() of a relative path, not found

I have a zip file containing the following structure (this is the root of the archive, not nested in a top-level folder, which I understand is a common cause of errors for aws-s3-lambda deployments):
- support/
  - shared.js
- one.js
- two.js
and then in one.js and two.js:
var shared = require("./support/shared");
// ...
When I run this code locally, it works. I use the aws-sdk to upload the zip file to AWS-S3 and then use aws.lambda.createFunction() to create a function with that name and handler and everything. The created function DOES show up in my Lambda dashboard, but when I test it, I get "Cannot find module './support/shared'". I have also tried var shared = require("./support/shared.js"); and that gives "Cannot find module './support/shared.js'".
This is for runtime node6.10. The filename cases are correct for case-sensitive lambda.
Shouldn't this work?? What's the gotcha?
Is there a way to verify the file structure that Lambda is working in to show that the additional ./support/shared.js file actually made it to the working directory or whatever it uses?
The gotcha is that a zip file created on a Windows machine has the wrong permission bits (chmod) set for when AWS unpacks it. The files are there but inaccessible, and Node just reports a generic "Cannot find module" error instead of saying that access to the folder is denied.
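One way to sidestep this is to write the permission bits into the archive explicitly instead of relying on whatever the zip tool on Windows records. The following is a sketch using Python's standard zipfile module; the function name zip_with_unix_modes is hypothetical, and 0o644 is an assumed mode that makes files readable by everyone.

```python
import io
import stat
import zipfile
from typing import Mapping


def zip_with_unix_modes(files: Mapping[str, bytes], mode: int = 0o644) -> bytes:
    """Build a zip in memory whose entries carry explicit Unix permission bits."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            info = zipfile.ZipInfo(name)
            info.compress_type = zipfile.ZIP_DEFLATED
            # Mark the entry as created on Unix so the mode bits are meaningful.
            info.create_system = 3
            # The upper 16 bits of external_attr hold the Unix file mode.
            info.external_attr = (stat.S_IFREG | mode) << 16
            zf.writestr(info, data)
    return buf.getvalue()


# Assumed usage with the layout from the question:
# payload = zip_with_unix_modes({
#     "support/shared.js": open("support/shared.js", "rb").read(),
#     "one.js": open("one.js", "rb").read(),
#     "two.js": open("two.js", "rb").read(),
# })
```

Listing the archive afterwards (for example with zipfile.ZipFile.infolist()) also answers the question of how to verify which files and modes actually went into the package.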

Can download but file will not unzip as expected

I'm attempting to access the GEOmetadb database, which first involves downloading the SQL library. I did that and then I got the GEOmetadb library.
library(GEOmetadb)
Next I need the Geometadb file which is where things start to go wrong. I issue this command as seen exactly in the tutorial: https://bioconductor.riken.jp/packages/3.0/bioc/vignettes/GEOmetadb/inst/doc/GEOmetadb.html
if(!file.exists('GEOmetadb.sqlite')) getSQLiteFile()
It should not only download a .gz file but also unzip it. It downloads it but never unzips it. Instead I get the following error:
trying URL 'http://dl.dropbox.com/u/51653511/GEOmetadb.sqlite.gz'
Error in download.file(url_geo, destfile = localfile, mode = "wb") :
cannot open URL 'http://dl.dropbox.com/u/51653511/GEOmetadb.sqlite.gz'
In addition: Warning message:
In url(url_geo_2, open = "rb") :
cannot open: HTTP status was '403 Forbidden'
Just not sure what's going on here. Considering these are just the early tutorial steps I'm probably missing something really obvious but I'm hoping someone can help me out. Thanks!

Khan Academy API displayed on Geektool

I have been experimenting with the Khan Academy API found here
http://api-explorer.khanacademy.org/api/v1/user
and tried to find a way to display a user's points (and maybe some other information) on the desktop using GeekTool. I tried this
stackoverflow.com/questions/12514722/khan-academy-php-oauth-code
and
github.com/Khan/khan-api/
but nothing seems to work. The GitHub link is the Khan Academy API provided as is. The Stack Overflow link is from someone with a similar problem who found a solution: he wrote a PHP script based on the Temboo library and said to replace a few fields in the PHP and add both the PHP script and the Temboo source code to the webroot. So I added a folder called "php-sdk" to the webroot, which is /Library/WebServer/Documents/, and inside that folder another folder, "src", which contained the Khan Academy API and the Temboo library. Here is what I had:
cl.ly/image/2c2Z1B3T443L
Then I took a look at this and followed the steps until 6:19. Then I started the Apache server by entering this in terminal...
sudo apachectl restart
I opened a web browser, and typed in this...
localhost/php-sdk/src/khanAcademy.php
and I got this...
Warning: require(php-sdk/src/temboo.php): failed to open stream: No such file or directory in /Library/WebServer/Documents/php-sdk/src/khanAcademy.php on line 66
Fatal error: require(): Failed opening required 'php-sdk/src/temboo.php' (include_path='.:') in /Library/WebServer/Documents/php-sdk/src/khanAcademy.php on line 66
Any ideas on what this could mean or how I could fix it? I am not advanced in PHP or Python, but I would really love to find a solution to this problem, and I am willing to try anything that might work.
This error:
Warning: require(php-sdk/src/temboo.php): failed to open stream: No such file or directory in /Library/WebServer/Documents/php-sdk/src/khanAcademy.php on line 66
indicates that the path you're using for require is likely incorrect. Currently your PHP is trying to find a file called temboo.php here:
/Library/WebServer/Documents/php-sdk/src/php-sdk/src/temboo.php
Note the repeated directory structure. I'll make an assumption that your temboo.php is in the same directory as your khanAcademy.php file. In that case, simply change require "php-sdk/src/temboo.php" to require "temboo.php". If my assumption is incorrect, just adjust the include path accordingly.
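The doubled directory can be reproduced with a small path calculation. This is a simplified model written in Python for illustration: it only joins the required path onto the including script's directory, whereas real PHP also consults include_path (which is '.:' in the error message). The function name resolve_require is hypothetical.

```python
from pathlib import PurePosixPath


def resolve_require(script_path: str, required: str) -> str:
    """Join a relative require path onto the including script's directory."""
    return str(PurePosixPath(script_path).parent / required)


script = "/Library/WebServer/Documents/php-sdk/src/khanAcademy.php"

# The path from the error message repeats the directory structure.
print(resolve_require(script, "php-sdk/src/temboo.php"))
# → /Library/WebServer/Documents/php-sdk/src/php-sdk/src/temboo.php

# The suggested fix, assuming temboo.php sits next to khanAcademy.php.
print(resolve_require(script, "temboo.php"))
# → /Library/WebServer/Documents/php-sdk/src/temboo.php
```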

SimplePie: Autoloader.php opening stream

Good afternoon. I'm using SimplePie to load RSS feeds onto my site, which is hosted on GoDaddy, with a WP blog providing the RSS feed. I have read through the SimplePie docs and searched the forums, but can't seem to figure this out. I'm wondering if my folder permissions aren't correct for the ../cache folder?
Error MSG:
Warning: require_once(../php/autoloader.php) [function.require-once]: failed to open stream: No such file or directory in D:\Hosting\12074013\html\test.php on line 15
Fatal error: require_once() [function.require]: Failed opening required '../php/autoloader.php' (include_path='.;C:\php\pear') in D:\Hosting\12074013\html\test.php on line 15
Thanks for your help!!!
It's not permissions on the cache directory; it's likely a problem with how you set up the SimplePie directories or with your include statement.
The file you are running is D:\Hosting\12074013\html\test.php, and it's trying to include autoloader.php, which the statement
require_once(../php/autoloader.php)
resolves to D:\Hosting\12074013\php, one level above the web root (html) rather than inside it. Check your install path and set the include to the correct directory path.
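To see where the ../ ends up, the path can be worked out by hand or with a few lines of Python. This sketch uses the standard ntpath module so that Windows-style paths are handled on any platform; it models only the directory arithmetic, not PHP's include_path lookup, and the function name resolve_include is hypothetical.

```python
import ntpath


def resolve_include(script_path: str, included: str) -> str:
    """Resolve a relative include against the directory of the running script."""
    script_dir = ntpath.dirname(script_path)
    # normpath collapses the ".." segment and normalizes the separators.
    return ntpath.normpath(ntpath.join(script_dir, included))


print(resolve_include(r"D:\Hosting\12074013\html\test.php", "../php/autoloader.php"))
# → D:\Hosting\12074013\php\autoloader.php
```

If the SimplePie files were actually uploaded somewhere under html, the include path has to point there instead.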