Indices not rotated in SPHINX - sql

I want to use two indexes (two queries), but Sphinx gives this warning:
[sphinx@reea3 ~]$ /usr/bin/indexer --config /home/sphinx/sphinx/sphinx.conf --all --rotate
Sphinx 0.9.9-release (r2117)
Copyright (c) 2001-2009, Andrew Aksyonoff
using config file '/home/sphinx/sphinx/sphinx.conf'...
indexing index 'job1'...
collected 6 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 6 docs, 837 bytes
total 0.005 sec, 152988 bytes/sec, 1096.69 docs/sec
indexing index 'job2'...
collected 8 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 8 docs, 151 bytes
total 0.002 sec, 53794 bytes/sec, 2850.01 docs/sec
total 10 reads, 0.000 sec, 0.2 kb/call avg, 0.0 msec/call avg
total 20 writes, 0.000 sec, 0.3 kb/call avg, 0.0 msec/call avg
WARNING: failed to open pid_file '/var/run/sphinx/searchd.pid'.
WARNING: indices NOT rotated.
What needs to be done for Sphinx to rotate the indices?
Below is the configuration file. I have taken out the SQL query and the attribute list.
#
# Minimal Sphinx configuration sample (clean, simple, functional)
#
source jobSource1
{
type = mysql
sql_host = localhost
sql_user = root
# sql_pass = 123456
sql_pass =
sql_db = dbx
sql_port = 3306
sql_query = sql_query
sql_attr_uint = attributes go here
sql_attr_str2ordinal = attributes go here
}
source jobSource2
{
type = mysql
sql_host = localhost
sql_user = root
# sql_pass = 123456
sql_pass =
sql_db = dbx
sql_port = 3306
sql_query = sql_query
sql_attr_uint = attribute
sql_attr_str2ordinal = attribute
}
index job1
{
source = jobSource1
path = /home/sphinx/jobs/job1
docinfo = extern
charset_type = utf-8
}
index job2
{
source = jobSource2
path = /home/sphinx/jobs/job1
docinfo = extern
charset_type = utf-8
}
indexer
{
mem_limit = 32M
}
searchd
{
port = 9312
log = /var/log/sphinx/searchd.log
query_log = /var/log/sphinx/query.log
read_timeout = 5
max_children = 30
pid_file = /var/run/sphinx/searchd.pid
max_matches = 1000
seamless_rotate = 1
preopen_indexes = 0
unlink_old = 1
}

After a brainstorming session with my colleagues we talked to the admins, and it turned out that the searchd process had been started with root privileges, so only the root user could modify the file.
After restarting the process with normal user privileges, everything worked fine.
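The warning in the question says that indexer could not open the pid file, so a quick way to confirm a permissions problem like this is to check whether the current user can read that file. The following is a minimal sketch, not part of Sphinx; the helper name `pid_file_status` is hypothetical, and the path is the `pid_file` value from the configuration above.

```python
import os


def pid_file_status(pid_path):
    """Describe whether the current user can read a searchd pid file.

    Hypothetical helper for illustration only. Note that a parent
    directory the user cannot search is also reported as "missing".
    """
    if not os.path.exists(pid_path):
        return "missing"
    if not os.access(pid_path, os.R_OK):
        return "unreadable"
    return "readable"


if __name__ == "__main__":
    # Path taken from the searchd section of the configuration above.
    print(pid_file_status("/var/run/sphinx/searchd.pid"))
```

Running this as the same user that runs indexer shows whether the rotation could have worked: anything other than "readable" means indexer cannot find the searchd process to signal.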

Use this command: sudo indexer --rotate --all

Pybullet on colab, cannot connect X server

I am using rl-baselines3-zoo to run DDPG with my custom env on Colab. After I used the zoo repo's video-recording function, it said it cannot connect to the server. It works fine with the built-in envs, so I guess the problem is in my env. I would appreciate some help.
I set everything up following the zoo's tutorials.
Traceback:
pybullet build time: Jul 12 2021 20:46:20
/usr/local/lib/python3.7/dist-packages/gym/logger.py:30: UserWarning:
WARN: Box bound precision lowered by casting to float32
startThreads creating 1 threads.
starting thread 0
started thread 0
argc=2
argv[0] = --unused
argv[1] = --start_demo_name=Physics Server
ExampleBrowserThreadFunc started
X11 functions dynamically loaded using dlopen/dlsym OK!
X11 functions dynamically loaded using dlopen/dlsym OK!
Creating context
Created GL 3.3 context
Direct GLX rendering context obtained
Making context current
GL_VENDOR=VMware, Inc.
GL_RENDERER=llvmpipe (LLVM 10.0.0, 256 bits)
GL_VERSION=3.3 (Core Profile) Mesa 20.0.8
GL_SHADING_LANGUAGE_VERSION=3.30
pthread_getconcurrency()=0
Version = 3.3 (Core Profile) Mesa 20.0.8
Vendor = VMware, Inc.
Renderer = llvmpipe (LLVM 10.0.0, 256 bits)
b3Printf: Selected demo: Physics Server
startThreads creating 1 threads.
starting thread 0
started thread 0
MotionThreadFunc thread started
ven = VMware, Inc.
ven = VMware, Inc.
Wrapping the env in a VecTransposeImage.
tcmalloc: large alloc 3276800000 bytes == 0x556b03bda000 # 0x7f7cad04a001 0x7f7caa3f554f 0x7f7caa445b58 0x7f7caa449b17 0x7f7caa4e8203 0x556a81194d54 0x556a81194a50 0x556a81209105 0x556a812037ad 0x556a81196c9f 0x556a811d7d79 0x556a811d4cc4 0x556a81196ea1 0x556a81205bb5 0x556a8119630a 0x556a812087f0 0x556a812037ad 0x556a811963ea 0x556a8120460e 0x556a812034ae 0x556a811963ea 0x556a8120532a 0x556a812034ae 0x556a812031b3 0x556a81201660 0x556a81194b59 0x556a81194a50 0x556a81208453 0x556a812034ae 0x556a811963ea 0x556a812043b5
tcmalloc: large alloc 3276800000 bytes == 0x556bc78da000 # 0x7f7cad04a001 0x7f7caa3f554f 0x7f7caa445b58 0x7f7caa449b17 0x7f7caa4e8203 0x556a81194d54 0x556a81194a50 0x556a81209105 0x556a812037ad 0x556a81196c9f 0x556a811d7d79 0x556a811d4cc4 0x556a81196ea1 0x556a81205bb5 0x556a8119630a 0x556a812087f0 0x556a812037ad 0x556a811963ea 0x556a8120460e 0x556a812034ae 0x556a811963ea 0x556a8120532a 0x556a812034ae 0x556a812031b3 0x556a81201660 0x556a81194b59 0x556a81194a50 0x556a81208453 0x556a812034ae 0x556a811963ea 0x556a812043b5
/content/gdrive/My Drive/hsr/rl-baselines3-zoo/logs/ddpg/FoodHuntingHSR-v0_3/videos/final-model-ddpg-FoodHuntingHSR-v0-step-0-to-step-200.mp4
/usr/local/lib/python3.7/dist-packages/gym/logger.py:30: UserWarning:
WARN: Tried to pass invalid video frame, marking as broken: Your frame has data type float32, but we require uint8 (i.e. RGB values from 0-255).
Saving video to /content/gdrive/My Drive/hsr/rl-baselines3-zoo/logs/ddpg/FoodHuntingHSR-v0_3/videos/final-model-ddpg-FoodHuntingHSR-v0-step-0-to-step-200.mp4
numActiveThreads = 0
stopping threads
destroy semaphore
semaphore destroyed
Thread with taskId 0 exiting
Thread TERMINATED
destroy main semaphore
main semaphore destroyed
finished
numActiveThreads = 0
btShutDownExampleBrowser stopping threads
Thread with taskId 0 exiting
Thread TERMINATED
destroy semaphore
semaphore destroyed
destroy main semaphore
main semaphore destroyed
Exception ignored in: <function VecVideoRecorder.__del__ at 0x7f7c2b5cc200>
Traceback (most recent call last):
File "/content/gdrive/My Drive/hsr/stable-baselines3/stable_baselines3/common/vec_env/vec_video_recorder.py", line 114, in __del__
File "/content/gdrive/My Drive/hsr/stable-baselines3/stable_baselines3/common/vec_env/vec_video_recorder.py", line 110, in close
File "/content/gdrive/My Drive/hsr/stable-baselines3/stable_baselines3/common/vec_env/base_vec_env.py", line 278, in close
File "/content/gdrive/My Drive/hsr/stable-baselines3/stable_baselines3/common/vec_env/dummy_vec_env.py", line 67, in close
File "/content/gdrive/My Drive/hsr/stable-baselines3/stable_baselines3/common/monitor.py", line 113, in close
File "/usr/local/lib/python3.7/dist-packages/gym/core.py", line 243, in close
File "/usr/local/lib/python3.7/dist-packages/gym/core.py", line 243, in close
File "/content/gdrive/My Drive/hsr/PyLIS/gym-foodhunting/gym_foodhunting/foodhunting/gym_foodhunting.py", line 538, in close
pybullet.error: Not connected to physics server
class FoodHuntingEnv(gym.Env):
    metadata = {'render.modes': ['human','rgb_array']}
    GRAVITY = -10.0
    BULLET_STEPS = 120 # p.setTimeStep(1.0 / 240.0), so 1 gym step == 0.5 sec.
    def __init__(self, render=False, robot_model=R2D2, max_steps=500, num_foods=3, num_fakes=0, object_size=1.0, object_radius_scale=1.0, object_radius_offset=1.0, object_angle_scale=1.0):
        """Initialize environment.
        """
        ### gym variables
        self.observation_space = robot_model.getObservationSpace() # classmethod
        self.action_space = robot_model.getActionSpace() # classmethod
        self.reward_range = (-1.0, 1.0)
        self.seed()
        ### pybullet settings
        self.ifrender = render
        self.physicsClient = p.connect(p.GUI if self.ifrender else p.DIRECT)
        p.setAdditionalSearchPath(pybullet_data.getDataPath())
        ### env variables
        self.robot_model = robot_model
        self.max_steps = max_steps
        self.num_foods = num_foods
        self.num_fakes = num_fakes
        self.object_size = object_size
        self.object_radius_scale = object_radius_scale
        self.object_radius_offset = object_radius_offset
        self.object_angle_scale = object_angle_scale
        self.plane_id = None
        self.robot = None
        self.object_ids = []
        ### episode variables
        self.steps = 0
        self.episode_rewards = 0.0
    def close(self):
        """Close environment.
        """
        p.disconnect(self.physicsClient)
    def reset(self):
        """Reset environment.
        """
        self.steps = 0
        self.episode_rewards = 0
        p.resetSimulation()
        # p.setTimeStep(1.0 / 240.0)
        p.setGravity(0, 0, self.GRAVITY)
        self.plane_id = p.loadURDF('plane.urdf')
        self.robot = self.robot_model()
        self.object_ids = []
        for i, (pos, orn) in enumerate(self._generateObjectPositions(num=(self.num_foods+self.num_fakes), radius_scale=self.object_radius_scale, radius_offset=self.object_radius_offset, angle_scale=self.object_angle_scale)):
            if i < self.num_foods:
                urdfPath = 'food_sphere.urdf'
            else:
                urdfPath = 'food_cube.urdf'
            object_id = p.loadURDF(urdfPath, pos, orn, globalScaling=self.object_size)
            self.object_ids.append(object_id)
        for i in range(self.BULLET_STEPS):
            p.stepSimulation()
        obs = self._getObservation()
        #print('reset laile')
        #self.robot.printAllJointInfo()
        return obs
    def step(self, action):
        """Apply action to environment, then return observation and reward.
        """
        self.steps += 1
        self.robot.setAction(action)
        reward = -1.0 * float(self.num_foods) / float(self.max_steps) # so agent needs to eat foods quickly
        for i in range(self.BULLET_STEPS):
            p.stepSimulation()
            reward += self._getReward()
        self.episode_rewards += reward
        obs = self._getObservation()
        done = self._isDone()
        pos, orn = self.robot.getPositionAndOrientation()
        info = { 'steps': self.steps, 'pos': pos, 'orn': orn }
        if done:
            #print('Done laile')
            info['episode'] = { 'r': self.episode_rewards, 'l': self.steps }
            # print(self.episode_rewards, self.steps)
        #print(self.robot.getBaseRollPosition(), self.robot.getTorsoLiftPosition(), self.robot.getHeadPosition(), self.robot.getArmPosition(), self.robot.getWristPosition(), self.robot.getGripperPosition()) # for HSR debug
        #print(self.robot.getHeadPosition(), self.robot.getGripperPosition()) # for R2D2 debug
        return obs, reward, done, info
    def render(self, mode='human', close=False):
        """This is a dummy function. This environment cannot control rendering timing.
        """
        if mode != 'rgb_array':
            return np.array([])
        return self._getObservation()
    def seed(self, seed=None):
        """Set random seed.
        """
        self.np_random, seed = seeding.np_random(seed)
        return [seed]
    def _getReward(self):
        """Detect contact points and return reward.
        """
        reward = 0
        contacted_object_ids = [ object_id for object_id in self.object_ids if self.robot.isContact(object_id) ]
        for object_id in contacted_object_ids:
            reward += 1 if self._isFood(object_id) else -1
            p.removeBody(object_id)
            self.object_ids.remove(object_id)
        return reward
    def _getObservation(self):
        """Get observation.
        """
        obs = self.robot.getObservation()
        return obs
    def _isFood(self, object_id):
        """Check if object_id is a food.
        """
        baseLink, urdfPath = p.getBodyInfo(object_id)
        return urdfPath == b'food_sphere.urdf' # otherwise, fake
    def _isDone(self):
        """Check if episode is done.
        """
        #print(self.object_ids,'self')
        available_object_ids = [ object_id for object_id in self.object_ids if self._isFood(object_id) ]
        #print(available_object_ids)
        return self.steps >= self.max_steps or len(available_object_ids) <= 0
    def _generateObjectPositions(self, num=1, retry=100, radius_scale=1.0, radius_offset=1.0, angle_scale=1.0, angle_offset=0.5*np.pi, z=1.5, near_distance=0.5):
        """Generate food positions randomly.
        """
        def genPos():
            r = radius_scale * self.np_random.rand() + radius_offset
            a = -np.pi * angle_scale + angle_offset
            b = np.pi * angle_scale + angle_offset
            ang = (b - a) * self.np_random.rand() + a
            return np.array([r * np.sin(ang), r * np.cos(ang), z])
        def isNear(pos, poss):
            for p, o in poss:
                if np.linalg.norm(p - pos) < near_distance:
                    return True
            return False
        def genPosRetry(poss):
            for i in range(retry):
                pos = genPos()
                if not isNear(pos, poss):
                    return pos
            return genPos()
        poss = []
        for i in range(num):
            pos = genPosRetry(poss)
            orn = p.getQuaternionFromEuler([0.0, 0.0, 2.0*np.pi*self.np_random.rand()])
            poss.append((pos, orn))
        return poss

Docplex ! Interrupt the execution

I am running a program with CPLEX in Python (docplex). It has reached a gap of 48% with 41 solutions, and it has already been running for 2 days. Can I interrupt it and get a result, without restarting the execution with a gap limit?
If you run on Windows, you could try Ctrl+C.
If that does not work,
you could run your model again with a limit of 1 new solution per solve and save the current solution after each solve. Then, whenever you abort, you still have the latest solution.
Example with the zoo story:
from docplex.mp.model import Model
from docplex.mp.progress import *
mdl = Model(name='buses')
nbbus40 = mdl.integer_var(name='nbBus40')
nbbus30 = mdl.integer_var(name='nbBus30')
mdl.add_constraint(nbbus40*40 + nbbus30*30 >= 300, 'kids')
mdl.minimize(nbbus40*500 + nbbus30*400)
mdl.parameters.mip.limits.solutions=1
while mdl.solve(log_output=False):
    for v in mdl.iter_integer_vars():
        print(v," = ",v.solution_value)
    print("status : ",mdl.solve_details.status)
    if ("optimal solution" in str(mdl.solve_details.status)):
        break
that gives
nbBus40 = 8.0
nbBus30 = 0
status : solution limit exceeded
nbBus40 = 7.0
nbBus30 = 1.0
status : solution limit exceeded
nbBus40 = 6.0
nbBus30 = 2.0
status : integer optimal solution

Posting volume down/up events using Core Graphics [duplicate]

Is there a way to emulate key presses of the media keys (volume up/down, play, pause, prev, next) on common Apple notebooks?
How?
That took some time and many hacks (trying around with ctypes, the IOKit native interface, Quartz and/or Cocoa). This seems like an easy solution now:
#!/usr/bin/python
import Quartz
# NSEvent.h
NSSystemDefined = 14
# hidsystem/ev_keymap.h
NX_KEYTYPE_SOUND_UP = 0
NX_KEYTYPE_SOUND_DOWN = 1
NX_KEYTYPE_PLAY = 16
NX_KEYTYPE_NEXT = 17
NX_KEYTYPE_PREVIOUS = 18
NX_KEYTYPE_FAST = 19
NX_KEYTYPE_REWIND = 20
def HIDPostAuxKey(key):
    def doKey(down):
        ev = Quartz.NSEvent.otherEventWithType_location_modifierFlags_timestamp_windowNumber_context_subtype_data1_data2_(
            NSSystemDefined, # type
            (0,0), # location
            0xa00 if down else 0xb00, # flags
            0, # timestamp
            0, # window
            0, # ctx
            8, # subtype
            (key << 16) | ((0xa if down else 0xb) << 8), # data1
            -1 # data2
            )
        cev = ev.CGEvent()
        Quartz.CGEventPost(0, cev)
    doKey(True)
    doKey(False)
for _ in range(10):
    HIDPostAuxKey(NX_KEYTYPE_SOUND_UP)
HIDPostAuxKey(NX_KEYTYPE_PLAY)
(While I needed this in Python for now, my question was not really Python related and of course you can easily translate that to any other language, esp. ObjC.)
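The only non-obvious part of the script is the `data1` argument, which packs the key code into the upper 16 bits and a press/release state into bits 8 to 15. The sketch below isolates that arithmetic so it can be checked without Quartz. The helper names `aux_key_data1` and `decode_aux_key_data1` are hypothetical, and the state values 0xA and 0xB are simply the ones used in the script above.

```python
# Key codes copied from the script above (hidsystem/ev_keymap.h).
NX_KEYTYPE_SOUND_UP = 0
NX_KEYTYPE_PLAY = 16

KEY_DOWN_STATE = 0xA  # state value the script uses for "key pressed"
KEY_UP_STATE = 0xB    # state value the script uses for "key released"


def aux_key_data1(key, down):
    """Pack a media key code and press state the way the script does."""
    state = KEY_DOWN_STATE if down else KEY_UP_STATE
    return (key << 16) | (state << 8)


def decode_aux_key_data1(data1):
    """Inverse of aux_key_data1: return (key, is_down)."""
    key = (data1 >> 16) & 0xFFFF
    state = (data1 >> 8) & 0xFF
    return key, state == KEY_DOWN_STATE


if __name__ == "__main__":
    print(hex(aux_key_data1(NX_KEYTYPE_PLAY, True)))       # 0x100a00
    print(hex(aux_key_data1(NX_KEYTYPE_SOUND_UP, False)))  # 0xb00
```

Because `0xa << 8` equals `0xa00`, the same value also serves as the modifier flags argument in the script.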
Swift 5 / MacOS 10.14.4 / Xcode 10.2
@IBAction func mediaPressed(_ sender: AnyObject) {
    let NX_KEYTYPE_SOUND_UP: UInt32 = 0
    let NX_KEYTYPE_SOUND_DOWN: UInt32 = 1
    let NX_KEYTYPE_PLAY: UInt32 = 16
    let NX_KEYTYPE_NEXT: UInt32 = 17
    let NX_KEYTYPE_PREVIOUS: UInt32 = 18
    let NX_KEYTYPE_FAST: UInt32 = 19
    let NX_KEYTYPE_REWIND: UInt32 = 20
    func HIDPostAuxKey(key: UInt32) {
        func doKey(down: Bool) {
            let flags = NSEvent.ModifierFlags(rawValue: (down ? 0xa00 : 0xb00))
            let data1 = Int((key<<16) | (down ? 0xa00 : 0xb00))
            let ev = NSEvent.otherEvent(with: NSEvent.EventType.systemDefined,
                                        location: NSPoint(x:0,y:0),
                                        modifierFlags: flags,
                                        timestamp: 0,
                                        windowNumber: 0,
                                        context: nil,
                                        subtype: 8,
                                        data1: data1,
                                        data2: -1
                                        )
            let cev = ev?.cgEvent
            cev?.post(tap: CGEventTapLocation.cghidEventTap)
        }
        doKey(down: true)
        doKey(down: false)
    }
    for _ in 1...10 {
        HIDPostAuxKey(key:NX_KEYTYPE_SOUND_UP)
    }
    HIDPostAuxKey(key:NX_KEYTYPE_PLAY)
}
Thank you Albert for that! I expanded on your script a bit to make it an executable that could in turn be called by Quicksilver or another launcher/trigger handler.
#!/usr/bin/python
# CLI program to control the mediakeys on OS X. Used to emulate the mediakey on a keyboard with no such keys.
# Easiest used in combination with a launcher/trigger software such as Quicksilver.
# Main part taken from http://stackoverflow.com/questions/11045814/emulate-media-key-press-on-mac
# Glue to make it into cli program by Fredrik Wallner http://www.wallner.nu/fredrik/
import Quartz
import sys
# NSEvent.h
NSSystemDefined = 14
# hidsystem/ev_keymap.h
NX_KEYTYPE_SOUND_UP = 0
NX_KEYTYPE_SOUND_DOWN = 1
NX_KEYTYPE_PLAY = 16
NX_KEYTYPE_NEXT = 17
NX_KEYTYPE_PREVIOUS = 18
NX_KEYTYPE_FAST = 19
NX_KEYTYPE_REWIND = 20
supportedcmds = {'playpause': NX_KEYTYPE_PLAY, 'next': NX_KEYTYPE_NEXT, 'prev': NX_KEYTYPE_PREVIOUS, 'volup': NX_KEYTYPE_SOUND_UP, 'voldown': NX_KEYTYPE_SOUND_DOWN}
def HIDPostAuxKey(key):
    def doKey(down):
        ev = Quartz.NSEvent.otherEventWithType_location_modifierFlags_timestamp_windowNumber_context_subtype_data1_data2_(
            NSSystemDefined, # type
            (0,0), # location
            0xa00 if down else 0xb00, # flags
            0, # timestamp
            0, # window
            0, # ctx
            8, # subtype
            (key << 16) | ((0xa if down else 0xb) << 8), # data1
            -1 # data2
            )
        cev = ev.CGEvent()
        Quartz.CGEventPost(0, cev)
    doKey(True)
    doKey(False)
if __name__ == "__main__":
    try:
        command = sys.argv[1]
        assert(command in supportedcmds)
        HIDPostAuxKey(supportedcmds[command])
    except (IndexError, AssertionError):
        # Parenthesized single-argument print works in both Python 2 and 3
        print("Usage: %s command" % (sys.argv[0],))
        print("\tSupported commands are %s" % ", ".join(sorted(supportedcmds)))
The script can be found at https://gist.github.com/4078034

How to improve get object size with python and boto3?

I use Cloudian Storage on premise with S3 API.
I need to monitor the used size of a bucket without Cloudian Admin Access.
With AWS CLI I use:
./aws --endpoint-url=https://s3-edc.emea.svc.corpintra.net:443 s3api list-objects --bucket edcs3mposdocifyb --output json --query "{\"size\": sum(Contents[].Size), \"objects\": length(Contents[])}"
This takes around 3 Minutes with following result:
{
"size": 216317367311,
"objects": 756771
}
I tried to get the same information with following python3 script using boto3.
import boto3
total_bucket_size = 0
total_bucket_objects = 0
s3 = boto3.resource('s3', aws_access_key_id="****", aws_secret_access_key="***", endpoint_url="https://my.cloudian.fqdn:443", verify="MyChain.cer")
bucket = s3.Bucket("mybucketname")
bucket_name = bucket.name
for obj in bucket.objects.all():
    obj_key = obj.key
    bucket_object = s3.Object(bucket_name, obj_key)
    obj_size = int(bucket_object.content_length)
    total_bucket_size += obj_size
    total_bucket_objects += 1
    print("%010d %s -> %d" %(total_bucket_objects,obj_key,obj_size))
print("Total size: %d" % total_bucket_size)
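But this code runs for several hours.
The goal is to write the result to an InfluxDB, which is quite easy with InfluxDBClient for Python.
Any idea why my boto3 code takes so long?
What can I change to speed up the code?
I found a way to reduce the runtime of the Python script to 4 minutes: read obj.size from the listing instead of creating an s3.Object for every key, because accessing content_length on an s3.Object triggers a separate HEAD request per object.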
import boto3
total_bucket_size = 0
total_bucket_objects = 0
# Credentials redacted; supply your own access key and secret key.
s3 = boto3.resource('s3', aws_access_key_id="****", aws_secret_access_key="***", endpoint_url="https://s3-edc.emea.svc.corpintra.net:443", verify="DaimlerChain.cer")
bucket = s3.Bucket("edcs3mposdocifyb")
bucket_name = bucket.name
for obj in bucket.objects.all():
    obj_key = obj.key
    #bucket_object = s3.Object(bucket_name, obj_key)
    #obj_size = int(bucket_object.content_length)
    obj_size = obj.size  # size comes with the listing, no extra request
    total_bucket_size += obj_size
    total_bucket_objects += 1
    print("%010d %s -> %d" %(total_bucket_objects,obj_key,obj_size))
print("Total size: %d" % total_bucket_size)
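The AWS CLI query in the question computes both totals from the listing alone, and the same aggregation can be sketched in Python over listing pages. This is a minimal sketch under assumptions: `summarize_listing` is a hypothetical helper, and `sample_pages` is made-up data shaped like S3 list responses (a `Contents` list of entries with `Key` and `Size`), so that the logic runs without a storage endpoint.

```python
def summarize_listing(pages):
    """Sum object sizes and counts over S3-style listing pages.

    Each page is a dict that may hold a "Contents" list whose entries
    carry a "Size" in bytes. Pages without "Contents" are skipped.
    """
    total_size = 0
    total_objects = 0
    for page in pages:
        for entry in page.get("Contents", []):
            total_size += entry["Size"]
            total_objects += 1
    return {"size": total_size, "objects": total_objects}


# Hypothetical stand-in for two listing pages plus one empty page.
sample_pages = [
    {"Contents": [{"Key": "a.txt", "Size": 100}, {"Key": "b.txt", "Size": 250}]},
    {"Contents": [{"Key": "c.txt", "Size": 50}]},
    {},
]

if __name__ == "__main__":
    print(summarize_listing(sample_pages))  # {'size': 400, 'objects': 3}
```

With boto3, the pages could come from `client.get_paginator("list_objects_v2").paginate(Bucket=...)`, assuming the storage endpoint supports that listing call. Dropping the per-object print would also remove some overhead for a bucket with several hundred thousand keys.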

How can I get the frames per sec from the header of an avi xvid video file?

The following is a hex-ascii dump of the first 300 bytes of an avi xvid file:
|52|49|46|46|8c|79|95|00|41|56|49|20|4c|49|53|54|94|6a|00|00 RIFF.y..AVI LIST.j..
|68|64|72|6c|61|76|69|68|38|00|00|00|40|9c|00|00|00|ec|36|00 hdrlavih8...@.....6.
|00|00|00|00|10|00|00|00|81|01|00|00|00|00|00|00|01|00|00|00 ....................
|df|9e|01|00|d0|02|00|00|40|02|00|00|00|00|00|00|00|00|00|00 ........@...........
|00|00|00|00|00|00|00|00|4c|49|53|54|3c|69|00|00|73|74|72|6c ........LIST<i..strl
|73|74|72|68|38|00|00|00|76|69|64|73|78|76|69|64|00|00|00|00 strh8...vidsxvid....
|00|00|00|00|00|00|00|00|e8|03|00|00|a8|61|00|00|00|00|00|00 .............a......
|81|01|00|00|df|9e|01|00|10|27|00|00|00|00|00|00|00|00|00|00 .........'..........
|d0|02|40|02|73|74|72|66|28|00|00|00|28|00|00|00|d0|02|00|00 ..@.strf(...(.......
|40|02|00|00|01|00|18|00|58|56|49|44|00|f8|25|00|00|00|00|00 @.......XVID..%.....
|00|00|00|00|00|00|00|00|00|00|00|00|73|74|72|64|c0|28|00|00 ............strd.(..
|00|00|00|00|bc|02|00|00|90|b2|08|00|2e|5c|76|69|64|65|6f|2e .............\video.
|70|61|73|73|00|00|2e|00|70|00|61|00|73|00|73|00|00|00|00|00 pass....p.a.s.s.....
|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00 ....................
|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00|00 ....................
In this particular case I know that the video has a frame rate of 25 frames per second. How can I use this dump to obtain the frame rate? Note, it is straightforward to obtain the frame rate for an avi msvc video file since the data structure is well documented. However, I have yet to find any document that defines the data structure for an avi xvid file.
The following is an improved version (IMHO) of my previous answer. This code finds 5 key features of AVI files.
"""
Purpose: To extract 5 important features from AVI files: time between frames,
frames per second, number of frames, width and height of each frame.
Author: V. Stokes (vs@it.uu.se)
Version: 2014.01.01.1
Note:
1. DT (time between samples) in microseconds, bytes: 32,33,34,35
of all AVI files that I tested. Byte 0 is the first byte in the AVI file.
2. SR (sample rate) is equivalent to fps (frames per second) and can be
calculated from the inverse of DT.
3. NF (number of frames), bytes: 48,49,50,51.
4. WD (width of each frame) in pixels, bytes: 64,65,66,67
5. HT (height of each frame) in pixels, bytes: 68,69,70,71
6. This python script works on all the AVI files that I have tested (so far),
which suggests there is some consistency in the different AVI formats.
"""
import os
# laptop desktop
dirpaths = [r'D:/Videos/', r'C:/Videos/']
for path in dirpaths:
    if os.path.exists(path):
        defDir = path
## All the following videos were "test AVI video downloads" from the Web
#video = 'julius.avi' #CRAM ( 56 frames,320x240,15fps, 66667 microseconds between frames)
#video = 'flyby-xvid.avi' #DIVX (495 frames,320x240,30fps, 33333 microseconds between frames)
#video = 'flyby-xvid-realtime.avi' #DIVX (495 frames,320x240,30fps, 33333 microseconds between frames)
#video = 'flyby-xvid-q5.avi' #DIVX
#video = 'flyby-xvid-mpeg.avi' #DIVX
#video = 'flyby-xvid-mpeg-q2.avi' #DIVX
#video = 'flyby-xvid-default.avi' #DIVX
#video = 'flyby-xvid-2p.avi' #DIVX
#video = 'flyby-divx.avi' #DX50
#video = 'ubAVIxvid10.avi' #XVID (849 frames,640x480,10fps,100000 microseconds between frames)
#video = 'toy_plane_liftoff.mp4' # This is not an AVI and doesn't work
video = 'toy_plane_liftoff.avi' #MJPG (100 frames,640x480,25fps, 40000 microseconds between frames)
inFile = defDir + video
print 'Input file: %s' %inFile
data = [0,0,0,0]
with open(inFile,'rb') as f: # binary mode, the header holds raw bytes
    f.seek(32,0) # start reading offset byte 32
    data[0] = f.read(4) # get 4 bytes (32,33,34,35) -- DT
    f.seek(48,0)
    data[1] = f.read(4) # get 4 bytes (48,49,50,51) -- NF
    f.seek(64,0)
    data[2] = f.read(4) # get 4 bytes (64,65,66,67) -- Width
    data[3] = f.read(4) # get 4 bytes (68,69,70,71) -- Height
def bufferToHex(num):
    for k in range(num):
        accumulator = ''
        buffer = data[k]
        for item in range(4):
            accumulator += '%s' % buffer[item]
        data[k] = accumulator
# Get 4 hex strings (4 bytes in each string)
bufferToHex(4)
for k in range(4): # loop on DT,NF,WD,HT
    prev_kvs = ''
    for hexval in data[k]: # loop on each byte of the 4-byte group
        strr = hexval.encode('hex') # hexadecimal string of length 2
        kvs = strr + prev_kvs # prepend, because the bytes are stored little-endian
        prev_kvs = kvs
    intVal = int(kvs,16)
    if k == 0:
        # DT
        print 'DT = %6d (microseconds)' %intVal
        print 'SR = %6d (frames per second)' %round(10.0**6/intVal)
    elif k == 1:
        # NF
        print 'NF = %6d (number of frames)' %intVal
    elif k == 2:
        # Width
        print 'Width = %6d (width in pixels)' %intVal
    else:
        # Height
        print 'Height = %6d (height in pixels)' %intVal
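The byte offsets used above line up with the AVI main header (`avih`) fields dwMicroSecPerFrame, dwTotalFrames, dwWidth and dwHeight, which are little-endian 32-bit integers. As a cross-check, here is a minimal Python 3 sketch that decodes the first 72 bytes of the hex dump from the question with the `struct` module. `parse_avi_header` is a hypothetical helper, and the fixed offsets assume that the `avih` chunk directly follows the first `LIST`/`hdrl` header, as it does in that dump.

```python
import struct

# First 72 bytes of the hex dump shown in the question.
HEADER_HEX = (
    "52494646 8c799500 41564920 4c495354 946a0000"
    "6864726c 61766968 38000000 409c0000 00ec3600"
    "00000000 10000000 81010000 00000000 01000000"
    "df9e0100 d0020000 40020000"
)


def parse_avi_header(header):
    """Read frame timing and size from fixed offsets in an AVI header.

    Assumes the layout RIFF / AVI / LIST hdrl / avih seen in the dump.
    """
    if header[0:4] != b"RIFF" or header[8:12] != b"AVI ":
        raise ValueError("not a RIFF/AVI header")
    # "<I" is a little-endian unsigned 32-bit integer.
    (usec_per_frame,) = struct.unpack_from("<I", header, 32)
    (total_frames,) = struct.unpack_from("<I", header, 48)
    width, height = struct.unpack_from("<II", header, 64)
    return {
        "usec_per_frame": usec_per_frame,
        "fps": 1e6 / usec_per_frame if usec_per_frame else None,
        "frames": total_frames,
        "width": width,
        "height": height,
    }


if __name__ == "__main__":
    info = parse_avi_header(bytes.fromhex(HEADER_HEX))
    print(info)
```

For the dump in the question this yields 40000 microseconds per frame, which is the expected 25 frames per second, along with 385 frames at 720x576. For a real file, pass the result of reading the first 72 bytes in binary mode.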
The following Python script (Python 2.7, Windows Vista-32 bit) can be used to extract the time between frames from AVI files and in turn the frame rate.
"""
Purpose: To extract the time between frames from AVI files
Author: V. Stokes (vs@it.uu.se)
Version: 2013.12.31.1
Note:
1. DT (time between samples) in microseconds is stored in bytes: 32,33,34,35
of all AVI files that I tested. Byte 0 is the first byte in the AVI file.
2. SR (sample rate) is equivalent to fps (frames per second) and can be
calculated from the inverse of DT.
3. DT is stored as 4 bytes (long C integer) which are in reverse order.
4. This python script works on all the AVI files that I have tested (so far),
which suggests there is some consistency in the different AVI formats.
5. All of the AVI files used for testing can be obtained from me via
an email request.
"""
import os
# laptop desktop
dirpaths = [r'D:/Videos/', r'C:/Videos/']
for path in dirpaths:
    if os.path.exists(path):
        defDir = path
#
## All of the following files were downloaded from different web "test" AVI file sites
## Bytes: 32,33,34,35 = 35,82,00,00
#video = 'flyby-xvid.avi' #DIVX (495 frames,320x240,30fps, 33333 microseconds between frames)
#video = 'flyby-xvid-realtime.avi' #DIVX (495 frames,320x240,30fps, 33333 microseconds between frames)
#video = 'flyby-xvid-q5.avi' #DIVX
#video = 'flyby-xvid-mpeg.avi' #DIVX
#video = 'flyby-xvid-mpeg-q2.avi' #DIVX
#video = 'flyby-xvid-default.avi' #DIVX
#video = 'flyby-xvid-2p.avi' #DIVX
## Bytes: 32,33,34,35 = 35,82,00,00
#video = 'flyby-divx.avi' #DX50
## Bytes: 32,33,34,35 = a0,86,01,00
#video = 'ubAVIxvid10.avi' #XVID (849 frames,640x480,10fps,100000 microseconds between frames)
#video = 'toy_plane_liftoff.mp4' # This is not an AVI and doesn't work!
#video = 'toy_plane_liftoff.avi' #MJPG (100 frames,640x480,25fps, 40000 microseconds between frames)
## Bytes: 32,33,34,35 = 6b,04,01,00
video = 'julius.avi' #CRAM ( 56 frames,320x240,15fps, 66667 microseconds between frames)
inFile = defDir+video
print 'Input file: %s' %inFile
with open(inFile,'rb') as f: # binary mode, the header holds raw bytes
    # Get time between frames (in hex format)
    f.seek(32,0) # start reading at byte 32
    data = f.read(4) # get 4 bytes (32,33,34,35)
def bufferToHex(buffer, start, count):
    accumulator = ''
    for item in range(count):
        accumulator += '%s' % buffer[start + item]
    return accumulator
hexc = bufferToHex(data, 0, 4)
timeBetweenFrames = 0
k = -2
for hexval in hexc:
    k += 2 # two hex characters
    kvs = hexval.encode('hex') # hexadecimal string of length 2
    kv = int(kvs,16) # convert to integer base-16 (hexadecimal)
    kp = 16**k
    timeBetweenFrames += kv*kp
print 'DT = %6d (microseconds)' %timeBetweenFrames
print 'SR = %6d (frames per second)' %round(10.0**6/timeBetweenFrames)