Getting the root directory of a file in Google Drive using API

I need to get the full path of the folders where a file is located in Google Drive. I'm getting the files themselves using the Google Drive API, but I need information about their parent folders.
I'm using the following code to get the list of spreadsheets in a Shared Drive:
from googleapiclient import discovery
from httplib2 import Http
from oauth2client import file, client, tools
# Change the value of SCOPES to 'https://www.googleapis.com/auth/drive'
# if you want to be able to read and write to the user's Google Drive.
SCOPES = 'https://www.googleapis.com/auth/drive'
store = file.Storage('storage.json')
creds = store.get()
if not creds or creds.invalid:
    flow = client.flow_from_clientsecrets('client_secret.json', SCOPES)
    creds = tools.run_flow(flow, store)
DRIVE = discovery.build('drive', 'v3', http=creds.authorize(Http()))
folder_id = "1Z1GzY-D3I3qwQu3oxIW-L1a9nXgD0PXl"
query = "mimeType='application/vnd.google-apps.spreadsheet'"
query+= "and fullText contains 'CLAS' and trashed = false"
# query += " and parents in '" + folder_id + "'"
spreadsheets = []
# Initialize the page token
next_page_token = None
# Loop until all pages of results have been retrieved
while True:
    # Execute the list request
    response = DRIVE.files().list(
        q=query,
        corpora='drive',
        includeItemsFromAllDrives=True,
        driveId='0AEJNMySKcEzsUk9PVA',
        supportsAllDrives=True,
        # orderBy='folder',
        pageSize=1000,
        fields='nextPageToken, files(id, name, parents, mimeType, webViewLink)',
        pageToken=next_page_token,
    ).execute()
    # Append the results to the list
    spreadsheets.extend(response.get('files', []))
    # Check if there is another page of results
    next_page_token = response.get('nextPageToken', None)
    if next_page_token is None:
        break
    # Set the page token for the next iteration
    # parameters['pageToken'] = next_page_token
# Print the number of results
print(f'Last spreadsheet found: {spreadsheets[-1]["name"]}. Number of spreadsheets: {len(spreadsheets)}')
This returns a list of dictionaries with the specified fields. I would like to know the names of the parent folders for each file, for which I'm trying:
from googleapiclient.errors import HttpError
for item in spreadsheets:
    if 'parents' in item:
        parent_folders_list = []
        parent_id = item['parents'][0]
        try:
            while parent_id:
                folder = DRIVE.files().get(fileId=parent_id, fields='name, id, parents').execute()
                parent_folders_list.append(folder.get("parents", []))
                if parent_id:
                    parent_id = parent_id[0]
        except HttpError as error:
            print('An error occurred: %s' % error)
        print(f'{item["name"]} is in {parent_folders_list}')
I've been able to verify that parent_id is correctly retrieved, and that I can access it, since I was able to open it in the browser. However, I get back 'File not found' errors for every parent_id. I wonder if DRIVE.files().get(fileId=...) is the correct way to fetch a folder via the API.
Any help would be greatly appreciated.
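For reference, here is a minimal sketch of walking the parent chain, under the assumption that the shared-drive setup above is in place. Note that files().get on items in a shared drive also needs supportsAllDrives=True (the loop above omits it), and that each step must follow the parents field of the folder just fetched:

def folder_path(drive, file_item):
    # Walk the parent chain of a Drive item and return folder names, root first.
    names = []
    parent_id = file_item.get('parents', [None])[0]
    while parent_id:
        folder = drive.files().get(
            fileId=parent_id,
            fields='id, name, parents',
            supportsAllDrives=True,  # required for items in shared drives
        ).execute()
        names.append(folder['name'])
        parents = folder.get('parents', [])
        parent_id = parents[0] if parents else None
    return list(reversed(names))

# Usage with the list collected above, e.g.:
# print(folder_path(DRIVE, spreadsheets[0]))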

Related

wait until the blob storage folder is created

I would like to download a picture into a blob folder.
Before that I need to create the folder first.
The code below is what I am doing.
The issue is that the folder needs time to be created:
when it comes to with open(abs_file_name, "wb") as f:
it cannot find the folder.
I am wondering whether there is an 'await' to know when the folder creation has completed, and then do the write operation.
for index, row in data.iterrows():
    url = row['Creatives']
    file_name = url.split('/')[-1]
    r = requests.get(url)
    abs_file_name = lake_root + file_name
    dbutils.fs.mkdirs(abs_file_name)
    if r.status_code == 200:
        with open(abs_file_name, "wb") as f:
            f.write(r.content)
The final sub-folder will not be created when using dbutils.fs.mkdirs() on blob storage.
It creates a file with the final sub-folder's name that looks like a directory, but it is not actually a directory. Look at the following demonstration:
dbutils.fs.mkdirs('/mnt/repro/s1/s2/s3.csv')
When I try to open this file, the error says that this is a directory.
This might be the issue with the code. So, try using the following code instead:
for index, row in data.iterrows():
    url = row['Creatives']
    file_name = url.split('/')[-1]
    r = requests.get(url)
    abs_file_name = lake_root + 'fail'  # creates the fake directory (to counter the problem we are facing above)
    dbutils.fs.mkdirs(abs_file_name)
    if r.status_code == 200:
        with open(lake_root + file_name, "wb") as f:
            f.write(r.content)
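Alternatively, a minimal sketch under the assumption that lake_root is the directory the files should land in, and is a path that both dbutils.fs.mkdirs and open() can address (the snippets above make the same assumption): create the directory once from its own path, never from a path ending in a file name, and only then write into it:

# Create the target directory itself; the path passed to mkdirs must not
# end in a file name, which is what caused the fake-file problem above.
dbutils.fs.mkdirs(lake_root)
for index, row in data.iterrows():
    url = row['Creatives']
    file_name = url.split('/')[-1]
    r = requests.get(url)
    if r.status_code == 200:
        with open(lake_root + file_name, "wb") as f:
            f.write(r.content)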

Dropbox - Automatic Refresh token Using oauth 2.0 with offline access

I know: automatic token refreshing is not a new topic.
This is the use case that generates my problem: let's say that we want to extract data from Dropbox. Below you can find the code. The first time it works perfectly: 1) the user goes to the generated link; 2) after allowing the app, they copy and paste the authorization code into the input box.
The problem arises when, some hours later, the user wants to do the same operation. How can I avoid or bypass generating a new authorization code and go straight to the operation?
As you can see, within a short period it is possible to re-inject the auth code (commented out in the code below), but after an hour or more this is no longer possible.
Any help is welcome.
#!/usr/bin/env python3
import dropbox
from dropbox import DropboxOAuth2FlowNoRedirect
'''
Populate your app key in order to run this locally
'''
APP_KEY = ""
auth_flow = DropboxOAuth2FlowNoRedirect(APP_KEY, use_pkce=True, token_access_type='offline')
target='/DVR/DVR/'
authorize_url = auth_flow.start()
print("1. Go to: " + authorize_url)
print("2. Click \"Allow\" (you might have to log in first).")
print("3. Copy the authorization code.")
auth_code = input("Enter the authorization code here: ").strip()
#auth_code="3NIcPps_UxAAAAAAAAAEin1sp5jUjrErQ6787_RUbJU"
try:
    oauth_result = auth_flow.finish(auth_code)
except Exception as e:
    print('Error: %s' % (e,))
    exit(1)
with dropbox.Dropbox(oauth2_refresh_token=oauth_result.refresh_token, app_key=APP_KEY) as dbx:
    dbx.users_get_current_account()
    print("Successfully set up client!")
    for entry in dbx.files_list_folder(target).entries:
        print(entry.name)
def dropbox_list_files(path):
    try:
        files = dbx.files_list_folder(path).entries
        files_list = []
        for file in files:
            if isinstance(file, dropbox.files.FileMetadata):
                metadata = {
                    'name': file.name,
                    'path_display': file.path_display,
                    'client_modified': file.client_modified,
                    'server_modified': file.server_modified
                }
                files_list.append(metadata)
        df = pd.DataFrame.from_records(files_list)
        return df.sort_values(by='server_modified', ascending=False)
    except Exception as e:
        print('Error getting list of files from Dropbox: ' + str(e))
#function to get the list of files in a folder
def create_links(target, csvfile):
    filesList = []
    print("creating links for folder " + target)
    files = dbx.files_list_folder('/'+target)
    filesList.extend(files.entries)
    print(len(files.entries))
    while files.has_more:
        files = dbx.files_list_folder_continue(files.cursor)
        filesList.extend(files.entries)
        print(len(files.entries))
    for file in filesList:
        if isinstance(file, dropbox.files.FileMetadata):
            filename = file.name + ',' + file.path_display + ',' + str(file.size) + ','
            link_data = dbx.sharing_create_shared_link(file.path_lower)
            filename += link_data.url + '\n'
            csvfile.write(filename)
            print(file.name)
        else:
            create_links(target+'/'+file.name, csvfile)
#create links for all files in the folder belgeler
create_links(target, open('links.csv', 'w', encoding='utf-8'))
listing = dbx.files_list_folder(target)
#todo: add implementation for files_list_folder_continue
for entry in listing.entries:
    if entry.name.endswith(".pdf"):
        # note: this simple implementation only works for files in the root of the folder
        res = dbx.sharing_get_shared_links(target + entry.name)
        #f.write(res.content)
        print('\r', res)
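For what it's worth, here is a minimal sketch of persisting the refresh token between runs, so the authorization-code flow only has to happen once. The token cache file name token.json is a hypothetical choice; everything else uses only calls that already appear in the snippet above:

import json
import os
import dropbox
from dropbox import DropboxOAuth2FlowNoRedirect

APP_KEY = ""
TOKEN_FILE = "token.json"  # hypothetical local cache for the refresh token

def get_dropbox_client():
    # Reuse a stored refresh token if one exists; the SDK then exchanges it
    # for short-lived access tokens automatically, with no user interaction.
    if os.path.exists(TOKEN_FILE):
        with open(TOKEN_FILE) as f:
            refresh_token = json.load(f)["refresh_token"]
    else:
        # First run only: do the one-time authorization-code flow.
        flow = DropboxOAuth2FlowNoRedirect(APP_KEY, use_pkce=True, token_access_type='offline')
        print("1. Go to: " + flow.start())
        auth_code = input("Enter the authorization code here: ").strip()
        refresh_token = flow.finish(auth_code).refresh_token
        with open(TOKEN_FILE, "w") as f:
            json.dump({"refresh_token": refresh_token}, f)
    return dropbox.Dropbox(oauth2_refresh_token=refresh_token, app_key=APP_KEY)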

How to convert raw emails (MIME) from AWS SES to Gmail?

I have a gmail account linked to my domain account.
AWS SES will send messages to my S3 bucket. From there, SNS will forward the message in a raw format to my gmail address.
How do I automatically convert the raw message into a standard email format?
The raw message is in the standard email format. I think what you want to know is how to parse that standard raw email into an object that you can manipulate so that you can forward it to yourself and have it look like a standard email. AWS provides a tutorial on how to forward emails with a lambda function, through SES, by first storing them in your S3 bucket: https://aws.amazon.com/blogs/messaging-and-targeting/forward-incoming-email-to-an-external-destination/
If you follow those instructions, you'll find that the email you receive comes as an attachment, not looking like a standard email. The following code is an alteration of the Python code provided by AWS that does what you're looking for (substitute it for the code provided in the tutorial):
# Copyright 2010-2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
# Altered from original by Adam Winter
#
# This file is licensed under the Apache License, Version 2.0 (the "License").
# You may not use this file except in compliance with the License. A copy of the
# License is located at
#
# http://aws.amazon.com/apache2.0/
#
# This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS
# OF ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.
import os
import boto3
import email
import re
import html
from botocore.exceptions import ClientError
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.application import MIMEApplication
from email.mime.image import MIMEImage
region = os.environ['Region']

def get_message_from_s3(message_id):
    incoming_email_bucket = os.environ['MailS3Bucket']
    incoming_email_prefix = os.environ['MailS3Prefix']
    if incoming_email_prefix:
        object_path = (incoming_email_prefix + "/" + message_id)
    else:
        object_path = message_id
    object_http_path = (f"http://s3.console.aws.amazon.com/s3/object/{incoming_email_bucket}/{object_path}?region={region}")
    # Create a new S3 client.
    client_s3 = boto3.client("s3")
    # Get the email object from the S3 bucket.
    object_s3 = client_s3.get_object(Bucket=incoming_email_bucket, Key=object_path)
    # Read the content of the message.
    file = object_s3['Body'].read()
    file_dict = {
        "file": file,
        "path": object_http_path
    }
    return file_dict
def create_message(file_dict):
    stringMsg = file_dict['file'].decode('utf-8')
    # Create a MIME container.
    msg = MIMEMultipart('alternative')
    sender = os.environ['MailSender']
    recipient = os.environ['MailRecipient']
    # Parse the email body.
    mailobject = email.message_from_string(file_dict['file'].decode('utf-8'))
    #print(mailobject.as_string())
    # Get original sender for reply-to
    from_original = mailobject['Return-Path']
    from_original = from_original.replace('<', '')
    from_original = from_original.replace('>', '')
    print(from_original)
    # Create a new subject line.
    subject = mailobject['Subject']
    print(subject)
    if mailobject.is_multipart():
        index = stringMsg.find('Content-Type: multipart/')
        stringBody = stringMsg[index:]
        #print(stringBody)
        stringData = 'Subject: ' + subject + '\nTo: ' + sender + '\nreply-to: ' + from_original + '\n' + stringBody
        message = {
            "Source": sender,
            "Destinations": recipient,
            "Data": stringData
        }
        return message
        # Note: everything below in this branch is unreachable after the return above.
        for part in mailobject.walk():
            ctype = part.get_content_type()
            cdispo = str(part.get('Content-Disposition'))
            # case for each common content type
            if ctype == 'text/plain' and 'attachment' not in cdispo:
                bodyPart = MIMEText(part.get_payload(decode=True), 'plain', part.get_content_charset())
                msg.attach(bodyPart)
            if ctype == 'text/html' and 'attachment' not in cdispo:
                mt = MIMEText(part.get_payload(decode=True), 'html', part.get_content_charset())
                email.encoders.encode_quopri(mt)
                del mt['Content-Transfer-Encoding']
                mt.add_header('Content-Transfer-Encoding', 'quoted-printable')
                msg.attach(mt)
            if 'attachment' in cdispo and 'image' in ctype:
                mi = MIMEImage(part.get_payload(decode=True), ctype.replace('image/', ''))
                del mi['Content-Type']
                del mi['Content-Disposition']
                mi.add_header('Content-Type', ctype)
                mi.add_header('Content-Disposition', cdispo)
                msg.attach(mi)
            if 'attachment' in cdispo and 'application' in ctype:
                ma = MIMEApplication(part.get_payload(decode=True), ctype.replace('application/', ''))
                del ma['Content-Type']
                del ma['Content-Disposition']
                ma.add_header('Content-Type', ctype)
                ma.add_header('Content-Disposition', cdispo)
                msg.attach(ma)
    # not multipart - i.e. plain text, no attachments, keeping fingers crossed
    else:
        body = MIMEText(mailobject.get_payload(decode=True), 'UTF-8')
        msg.attach(body)
    # The file name to use for the attached message. Uses regex to remove all
    # non-alphanumeric characters, and appends a file extension.
    filename = re.sub('[^0-9a-zA-Z]+', '_', subject_original)
    # Add subject, from and to lines.
    msg['Subject'] = subject
    msg['From'] = sender
    msg['To'] = recipient
    msg['reply-to'] = mailobject['Return-Path']
    # Create a new MIME object.
    att = MIMEApplication(file_dict["file"], filename)
    att.add_header("Content-Disposition", 'attachment', filename=filename)
    # Attach the file object to the message.
    msg.attach(att)
    message = {
        "Source": sender,
        "Destinations": recipient,
        "Data": msg.as_string()
    }
    return message
def send_email(message):
    aws_region = os.environ['Region']
    # Create a new SES client.
    client_ses = boto3.client('ses', region)
    # Send the email.
    try:
        # Provide the contents of the email.
        response = client_ses.send_raw_email(
            Source=message['Source'],
            Destinations=[
                message['Destinations']
            ],
            RawMessage={
                'Data': message['Data']
            }
        )
    # Display an error if something goes wrong.
    except ClientError as e:
        print('send email ClientError Exception')
        output = e.response['Error']['Message']
    else:
        output = "Email sent! Message ID: " + response['MessageId']
    return output
def lambda_handler(event, context):
    # Get the unique ID of the message. This corresponds to the name of the file
    # in S3.
    message_id = event['Records'][0]['ses']['mail']['messageId']
    print(f"Received message ID {message_id}")
    # Retrieve the file from the S3 bucket.
    file_dict = get_message_from_s3(message_id)
    # Create the message.
    message = create_message(file_dict)
    # Send the email and print the result.
    result = send_email(message)
    print(result)
For those getting this error:
'bytes' object has no attribute 'encode'
in this line:
body = MIMEText(mailobject.get_payload(decode=True), 'UTF-8')
I was able to make it work with the change below. I am not an expert on this, so the code might need some improvement; also, the email body still includes HTML tags. But at least it got delivered.
If decoding the email still fails, the error message will appear in your CloudWatch log, and you will also receive an email containing the error message.
payload = mailobject.get_payload(decode=True)
try:
    decodedPayload = payload.decode()
    body = MIMEText(decodedPayload, 'UTF-8')
    msg.attach(body)
except Exception as error:
    errorMsg = "An error occurred when decoding the email payload:\n" + str(error)
    print(errorMsg)
    body = errorMsg + "\nPlease download it manually from the S3 bucket."
    msg.attach(MIMEText(body, 'plain'))
It is up to you which information you want to add to the error email, like the subject or the from address.
Just another hint: with the above code you will get an error because subject_original is undefined. Just delete the following lines:
# The file name to use for the attached message. Uses regex to remove all
# non-alphanumeric characters, and appends a file extension.
filename = re.sub('[^0-9a-zA-Z]+', '_', subject_original)
# Create a new MIME object.
att = MIMEApplication(file_dict["file"], filename)
att.add_header("Content-Disposition", 'attachment', filename=filename)
# Attach the file object to the message.
msg.attach(att)
As far as I understand, these lines are supposed to add the original email as an attachment, which was not what I wanted.
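As a side note for readers who only need to inspect the raw MIME rather than re-send it through SES: here is a minimal standalone sketch using Python's standard email package (no AWS dependencies), with a hypothetical message.eml file standing in for the object fetched from S3:

import email
from email import policy

def parse_raw_email(raw_bytes):
    # Parse a raw MIME message and pull out a few common fields.
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    # get_body() picks the best matching part; prefer plain text over HTML.
    body_part = msg.get_body(preferencelist=('plain', 'html'))
    return {
        'subject': msg['Subject'],
        'from': msg['From'],
        'to': msg['To'],
        'body': body_part.get_content() if body_part else None,
    }

# Usage, e.g.:
# with open('message.eml', 'rb') as f:
#     print(parse_raw_email(f.read())['subject'])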

Softlayer Object Storage Python API Search

I followed softlayer-object-storage-python in order to return a list of my objects matching specific criteria.
This code seems to just return everything in my container, no matter what I put into the search:
sl_storage = object_storage.get_client(
    username=environment['slos_username'],
    password=environment['api_key'],
    auth_url=environment['auth_url']
)
# get container
sl_container = sl_storage[environment['object_container']]
# get list, the search function doesn't actually work...
containers = sl_container.search("icm10restapi-qa.zip.*")
I expect only to get back things that start with icm10restapi-qa.zip.
I also tried using ^=icm10restapi-qa.zip but no luck either.
Reviewing the method, it seems it is not possible to filter the objects as you would like:
https://github.com/softlayer/softlayer-object-storage-python/blob/master/object_storage/client.py#L147
API Operations for Search Services
My apologies for the inconvenience; I recommend filtering these in your code.
Updated
This script will help to filter your objects whose names start with a specific string:
import object_storage
import pprint
# Declare username, apikey and datacenter
USERNAME = 'set me'
API_KEY = 'set me'
DATACENTER = 'https://dal05.objectstorage.softlayer.net/auth/v1.0/'
# Creating object storage connection
sl_storage = object_storage.get_httplib2_client(USERNAME, API_KEY, auth_url=DATACENTER)
# Declare name to filter
name = 'icm10restapi-qa.zip'
# Filtering
containers = sl_storage.search(name)
for container in containers['results']:
    if container.__dict__['name'].startswith(name):
        print(container)
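If wildcard-style patterns (like the icm10restapi-qa.zip.* attempted above) are needed rather than a plain prefix, a small variation using the standard-library fnmatch module could be sketched as follows; the pattern value is just an example:

import fnmatch

pattern = 'icm10restapi-qa.zip*'
# Keep only the objects whose names match the shell-style wildcard pattern.
matching = [c for c in containers['results']
            if fnmatch.fnmatch(c.__dict__['name'], pattern)]
for obj in matching:
    print(obj)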

web2py rest api endpoint gives invalid path output

I have made a web2py web application. The api endpoints exposed are as follows.
"/comments[comments]"
"/comments/id/{comments.id}"
"/comments/id/{comments.id}/:field"
"/comments/user-id/{comments.user_id}"
"/comments/user-id/{comments.user_id}/:field"
"/comments/date-commented/{comments.date_commented.year}"
"/comments/date-commented/{comments.date_commented.year}/:field"
"/comments/date-commented/{comments.date_commented.year}/{comments.date_commented.month}"
"/comments/date-commented/{comments.date_commented.year}/{comments.date_commented.month}/:field"
"/comments/date-commented/{comments.date_commented.year}/{comments.date_commented.month}/{comments.date_commented.day}"
"/comments/date-commented/{comments.date_commented.year}/{comments.date_commented.month}/{comments.date_commented.day}/:field"
"/comments/date-commented/{comments.date_commented.year}/{comments.date_commented.month}/{comments.date_commented.day}/{comments.date_commented.hour}"
"/comments/date-commented/{comments.date_commented.year}/{comments.date_commented.month}/{comments.date_commented.day}/{comments.date_commented.hour}/:field"
"/comments/date-commented/{comments.date_commented.year}/{comments.date_commented.month}/{comments.date_commented.day}/{comments.date_commented.hour}/{comments.date_commented.minute}"
"/comments/date-commented/{comments.date_commented.year}/{comments.date_commented.month}/{comments.date_commented.day}/{comments.date_commented.hour}/{comments.date_commented.minute}/:field"
"/comments/date-commented/{comments.date_commented.year}/{comments.date_commented.month}/{comments.date_commented.day}/{comments.date_commented.hour}/{comments.date_commented.minute}/{comments.date_commented.second}"
"/comments/date-commented/{comments.date_commented.year}/{comments.date_commented.month}/{comments.date_commented.day}/{comments.date_commented.hour}/{comments.date_commented.minute}/{comments.date_commented.second}/:field"
"/comments/complaint-id/{comments.complaint_id}"
"/comments/complaint-id/{comments.complaint_id}/:field"
The comments model is as follows
models/db.py
db.define_table(
    'comments',
    Field('user_id', db.auth_user),
    Field('comment_made', 'string', length=2048),
    Field('date_commented', 'datetime', default=datetime.now),
    Field('complaint_id', db.complaints),
    Field('detailed_status', 'string', length=2048),
)
I have been successful in retrieving a single comment via the following request:
localhost:8000/api/comments/id/1.json
Now I wish to retrieve all the comments, but I am not able to figure out how to use /comments[comments] to retrieve them all.
I have tried
localhost:8000/api/comments.json
but it gives "invalid path" as output.
I have realized requests such as http://localhost:8000/api/comments/complaint-id/1.json
also give "invalid path" as output.
Please help.
EDIT:
Controllers/default.py
@request.restful()
def api():
    response.view = 'generic.' + request.extension
    def GET(*args, **kargs):
        patterns = 'auto'
        parser = db.parse_as_rest(patterns, args, kargs)
        if parser.status == 200:
            return dict(content=parser.response)
        else:
            raise HTTP(parser.status, parser.error)
    def POST(*args, **kargs):
        return dict()
    return locals()
routes.py in the main web2py folder to change the default application:
routers = dict(
    BASE = dict(
        default_application='GRS',
    )
)
Another observation:
I added another endpoint as below:
def comm():
    """Comments api substitute"""
    rows = db().select(db.comments.ALL)  ## this line shows error
    # rows = db(db.comments.id > 0).select()
    # rows = [[5,6],[3,4],[1,2]]
    # for row in rows:
    #     print row.id
    return dict(message=rows)
Even now I am not able to retrieve all comments with "/comm.json". This gives a web2py error ticket which says "need more than 1 value to unpack" on the line rows = db().select(db.comments.ALL). Are the above "invalid path" output and this error related in some way?
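As an aside, here is a minimal sketch of a plain controller that returns every comment as JSON, bypassing the pattern-based router entirely; the function name comments_all is a hypothetical choice:

def comments_all():
    # Select every row of the comments table and serialize it.
    rows = db(db.comments.id > 0).select()
    return response.json(dict(comments=rows.as_list()))

This would then be reachable at localhost:8000/GRS/default/comments_all.json (or via the default-application routing configured above).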