Writing data to Excel gives me 'ZIP does not support timestamps before 1980' - pandas

I hope I'm not creating a duplicate; I looked around (Stack Overflow and other forums) and found some similar questions, but none of them solved my problem.
I have a Python script whose only job is to query the DB, create a pandas DataFrame, and write it to an Excel file.
The code worked without problems locally, but when I moved it to my server it started giving this error:
File "Test.py", line 34, in <module>
test()
File "Test.py", line 31, in test
ex.generate_file()
File "/home/carlo/Test/Utility/ExportExcell.py", line 96, in generate_file
writer.save()
File "/usr/local/lib/python2.7/dist-packages/pandas/io/excel.py", line 1952, in save
return self.book.close()
File "/usr/local/lib/python2.7/dist-packages/xlsxwriter/workbook.py", line 306, in close
self._store_workbook()
File "/usr/local/lib/python2.7/dist-packages/xlsxwriter/workbook.py", line 677, in _store_workbook
xlsx_file.write(os_filename, xml_filename)
File "/usr/lib/python2.7/zipfile.py", line 1135, in write
zinfo = ZipInfo(arcname, date_time)
File "/usr/lib/python2.7/zipfile.py", line 305, in __init__
raise ValueError('ZIP does not support timestamps before 1980')
ValueError: ZIP does not support timestamps before 1980
To make sure everything is OK I printed my DataFrame, and it looks fine to me; when I run the script locally it generates the Excel file without problems:
Computer_System_Memory_Size Count_of_HostName Disk_Total_Size Number_of_CPU OS_Family
0 5736053088256 70 6072238035456 282660 Windows
1 96159653888 607 96630589440 2451066 vCenter
2 0 9 0 36342 Virtualization
3 2469361287143424 37 2389533519619072 149406 Unix
4 3691651514368 90 5817485303808 363420 Linux
I don't see any timestamps here, and this is the relevant part of my code:
pivot = pd.DataFrame.from_dict(pivot)  # pivot = information extracted from the DB
# Force these columns to numeric in case pandas confuses them with datetimes
pivot['Count_of_HostName'] = pd.to_numeric(pivot['Count_of_HostName'], downcast='signed')
pivot['Disk_Total_Size'] = pd.to_numeric(pivot['Disk_Total_Size'], downcast='signed')
pivot['Computer_System_Memory_Size'] = pd.to_numeric(pivot['Computer_System_Memory_Size'], downcast='signed')
pivot['Number_of_CPU'] = pd.to_numeric(pivot['Number_of_CPU'], downcast='signed')
print pivot
name = 'TempReport/Report.xlsx'  # set up the file name
writer = pd.ExcelWriter(name, engine='xlsxwriter')  # create the Excel writer
pivot.to_excel(writer, 'Pivot', index=False)  # write my data to the 'Pivot' sheet
writer.save()  # write to file; this is where it fails
Does anyone know why this fails on an Ubuntu 16.04 server with the 'ZIP does not support timestamps before 1980' error? I have already checked many things: library versions, and that the data contains no datetime values.

XlsxWriter sets the individual XML files that make up an XLSX file to a creation date of 1/1/1980, which is (I think) the ZIP epoch and the date used by Excel. This allows binary reproducibility of files created by XlsxWriter when the same input data and metadata are used.
It sets the date as follows (for the non-in-memory zipfile.py case):
timestamp = time.mktime((1980, 1, 1, 0, 0, 0, 0, 0, 0))
os.utime(os_filename, (timestamp, timestamp))
The error that you are seeing occurs when this fails in some way and the date is set before 1/1/1980.
I've only seen this happen once before, in a situation where the user was running in a container and the container had a different time from the host system.
Do you have a situation like this, or one where the timestamp may be set incorrectly for some reason?
Update: Try running this in the same environment as the failing example:
import os
import time
filename = 'file.txt'
file = open(filename, 'w')
file.close()
timestamp = time.mktime((1980, 1, 1, 0, 0, 0, 0, 0, 0))
os.utime(filename, (timestamp, timestamp))
print(time.ctime(os.path.getmtime(filename)))
# Should give:
# Tue Jan 1 00:00:00 1980
Update: This issue is fixed in XlsxWriter >= 1.1.9.
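A quick way to check which version is installed (a minimal sketch; upgrade the package if it is older than 1.1.9):
import xlsxwriter
print(xlsxwriter.__version__)  # should be >= 1.1.9 for the fix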

Try using the openpyxl engine instead. Note that to_excel is a DataFrame method, so call it on your DataFrame rather than on pd:
df.to_excel('file_name.xlsx', engine='openpyxl')
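Applied to the asker's code, that would mean passing the engine to the writer instead, roughly like this (a sketch, assuming openpyxl is installed):
writer = pd.ExcelWriter('TempReport/Report.xlsx', engine='openpyxl')  # swap xlsxwriter for openpyxl
pivot.to_excel(writer, 'Pivot', index=False)
writer.save()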

This issue is fixed in XlsxWriter 1.2.1!

Related

What should the CSV file format look like for correct parsing with pandas?

I get an error parsing my CSV file that I want to convert to XLSX. The part of the file it complains about looks like this:
Number,2000-00-00
,,System,,,,,,,,,,,,System,
,,Type,,,,Type,,,,Type,,,,Type,,,,Type,,,,Type,
Settings, Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,
System,"dist, dist, dist, dist",0,104,21,65,0,128,5,29,0,62,0,1,0,11993,26,56,0,1321,14,18,0,63,0,0,
Number,2000-00-00
,,System,,,,,,,,,,,,System,
,,Type,,,,Type,,,,Type,,,,Type,,,,Type,,,,Type,
Settings, Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,Sample,
System,"dist, dist, dist, dist",0,1141,29,71,0,121,3,15,0,62,0,0,0,14034,22,47,0,84,1,10,0,80,0,0,
Total,,0,8436,13,62,0,839,24,163,0,451,0,2,0,97906,235,434,0,846,38,10,0,462,0,2,
Total 2,,,,,841,,,,8556,,,,453,,,,985,,,,869,,,,46
Total 3 ,,,,,,,,,,,,,350,,,,,,,,,,,,1078
Full,,2898
Example of error:
converter | Skipping line 3: expected 16 fields, saw 24
converter | Skipping line 4: expected 16 fields, saw 27
converter | Skipping line 5: expected 16 fields, saw 27
So I understand it's about the many consecutive delimiters, which are needed to form empty fields. Although pandas complains about the parsing error, the file itself looks correct when opened. Is it really about the consecutive delimiters? If so, why does the file itself look correct? If not, what could be the problem?
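For what it's worth, pandas infers the expected field count from the first line, so wider rows later in the file trigger the 'expected 16 fields, saw 24' skips. A possible workaround (a sketch; the column count of 27 is a guess based on the widest row shown above, and the file names are hypothetical) is to supply explicit column labels so shorter rows are padded with NaN instead of lines being skipped:
import pandas as pd

# Give every row the same width; rows with fewer fields are padded with NaN.
df = pd.read_csv('data.csv', header=None, names=list(range(27)))
df.to_excel('data.xlsx', index=False)  # requires openpyxl or xlsxwriter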

redis.exceptions.DataError: Invalid input of type: 'NoneType'. Convert to a byte, string or number first

I've recently started to use Redis and RQ to run background processes. I built a Dash app which works fine on Heroku and used to work locally as well. Recently, I tried to test the same app locally again and I keep getting the following error - although I'm using exactly the same code hosted on Heroku:
redis.exceptions.DataError: Invalid input of type: 'NoneType'. Convert to a byte, string or number first.
In my requirements.txt and virtual env on Ubuntu 18.04 I have redis v.3.0.1, rq 0.13.0
When I run redis-server on my terminal I see that Redis 4.0.9 is used (that's also confusing to me).
I googled for two days looking for a solution, to no avail.
Does anyone have an idea of what might have happened and how to solve this error?
Here is the full relevant traceback:
File "/home/tom/dashenv/pb101_models/pages/cumulative_culture.py", line 1026, in stop_or_start_update
job = q.fetch_job(job_id)
File "/home/tom/dashenv/dash/lib/python3.6/site-packages/rq/queue.py", line 142, in fetch_job
self.remove(job_id)
File "/home/tom/dashenv/dash/lib/python3.6/site-packages/rq/queue.py", line 186, in remove
return self.connection.lrem(self.key, 1, job_id)
File "/home/tom/dashenv/dash/lib/python3.6/site-packages/redis/client.py", line 1580, in lrem
return self.execute_command('LREM', name, count, value)
File "/home/tom/dashenv/dash/lib/python3.6/site-packages/redis/client.py", line 754, in execute_command
connection.send_command(*args)
File "/home/tom/dashenv/dash/lib/python3.6/site-packages/redis/connection.py", line 619, in send_command
self.send_packed_command(self.pack_command(*args))
File "/home/tom/dashenv/dash/lib/python3.6/site-packages/redis/connection.py", line 659, in pack_command
for arg in imap(self.encoder.encode, args):
File "/home/tom/dashenv/dash/lib/python3.6/site-packages/redis/connection.py", line 124, in encode
"byte, string or number first." % typename)
redis.exceptions.DataError: Invalid input of type: 'NoneType'. Convert to a byte, string or number first.
Thanks in advance for any suggestion/hint.
All best,
Tom
Check this link: redis 3.0
It says that redis-py no longer accepts None values.
Try json.dumps(None); that worked for me.
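For illustration, a minimal sketch of the behaviour (assuming a local Redis server and redis-py 3.x):
import json
import redis

r = redis.Redis()
# redis-py 3.x raises redis.exceptions.DataError for None values:
# r.set('key', None)
r.set('key', json.dumps(None))  # stores the string 'null' instead
In the traceback above, this suggests job_id is None by the time q.fetch_job(job_id) calls LREM, so guarding against a None job id before the call is another option.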

dask read_sql_table fails on sqlite table with numeric datetime

I've been given some large sqlite tables that I need to read into dask dataframes. The tables have columns with datetimes (ISO formatted strings) stored as sqlite NUMERIC data type. I am able to read in this kind of data using Pandas' read_sql_table. But, the same call from dask gives an error. Can someone suggest a good workaround? (I do not know of an easy way to change the sqlite data type of these columns from NUMERIC to TEXT.) I am pasting a minimal example below.
import sqlalchemy
import pandas as pd
import dask.dataframe as ddf
connString = "sqlite:///c:\\temp\\test.db"
engine = sqlalchemy.create_engine(connString)
conn = engine.connect()
conn.execute("create table testtable (uid integer Primary Key, datetime NUM)")
conn.execute("insert into testtable values (1, '2017-08-03 01:11:31')")
print(conn.execute('PRAGMA table_info(testtable)').fetchall())
conn.close()
pandasDF = pd.read_sql_table('testtable', connString, index_col='uid', parse_dates={'datetime':'%Y-%m-%d %H:%M:%S'})
pandasDF.head()
daskDF = ddf.read_sql_table('testtable', connString, index_col='uid', parse_dates={'datetime':'%Y-%m-%d %H:%M:%S'})
Here is the traceback:
Warning (from warnings module):
File "C:\Program Files\Python36\lib\site-packages\sqlalchemy\sql\sqltypes.py", line 596
'storage.' % (dialect.name, dialect.driver))
SAWarning: Dialect sqlite+pysqlite does *not* support Decimal objects natively, and SQLAlchemy must convert from floating point - rounding errors and other issues may occur. Please consider storing Decimal numbers as strings or integers on this platform for lossless storage.
Traceback (most recent call last):
File "<pyshell#2>", line 1, in <module>
daskDF = ddf.read_sql_table('testtable', connString, index_col='uid', parse_dates={'datetime':'%Y-%m-%d %H:%M:%S'})
File "C:\Program Files\Python36\lib\site-packages\dask\dataframe\io\sql.py", line 98, in read_sql_table
head = pd.read_sql(q, engine, **kwargs)
File "C:\Program Files\Python36\lib\site-packages\pandas\io\sql.py", line 416, in read_sql
chunksize=chunksize)
File "C:\Program Files\Python36\lib\site-packages\pandas\io\sql.py", line 1104, in read_query
parse_dates=parse_dates)
File "C:\Program Files\Python36\lib\site-packages\pandas\io\sql.py", line 157, in _wrap_result
coerce_float=coerce_float)
File "C:\Program Files\Python36\lib\site-packages\pandas\core\frame.py", line 1142, in from_records
coerce_float=coerce_float)
File "C:\Program Files\Python36\lib\site-packages\pandas\core\frame.py", line 6304, in _to_arrays
data = lmap(tuple, data)
File "C:\Program Files\Python36\lib\site-packages\pandas\compat\__init__.py", line 129, in lmap
return list(map(*args, **kwargs))
TypeError: must be real number, not str
EDIT: The comments by @mdurant make me wonder now if this is a bug in sqlalchemy. The following code gives the same error message as pandas does:
import sqlalchemy as sa
from sqlalchemy import text
m = sa.MetaData()
table = sa.Table('testtable', m, autoload=True, autoload_with=engine)
resultList = conn.execute(sa.sql.select(table.columns).select_from(table)).fetchall()
print(resultList)
resultList2 = conn.execute(sa.sql.select(columns=[text('uid'),text('datetime')], from_obj = text('testtable'))).fetchall()
print(resultList2)
Traceback (most recent call last):
File "<ipython-input-20-188c84a35d95>", line 1, in <module>
print(resultList)
File "c:\program files\python36\lib\site-packages\sqlalchemy\engine\result.py", line 156, in __repr__
return repr(sql_util._repr_row(self))
File "c:\program files\python36\lib\site-packages\sqlalchemy\sql\util.py", line 329, in __repr__
", ".join(trunc(value) for value in self.row),
TypeError: must be real number, not str
Puzzling.
Here is some further information, which hopefully can lead to an answer.
The query being executed at the line in question is:
pd.read_sql(sql.select(table.columns).select_from(table),
engine, index_col='uid')
which fails as you show (the limit is not relevant here).
However, the text version of the same query
sql.select(table.columns).select_from(table).compile().string
-> 'SELECT testtable.uid, testtable.datetime \nFROM testtable'
pd.read_sql('SELECT testtable.uid, testtable.datetime \nFROM testtable',
engine, index_col='uid') # works fine
The following workaround, using a cast in the query, does work (but isn't pretty):
import sqlalchemy as sa
engine = sa.create_engine(connString)
table = sa.Table('testtable', m, autoload=True, autoload_with=engine)
uid, dt = list(table.columns)
q = sa.select([dt.cast(sa.types.String)]).select_from(table)
daskDF = ddf.read_sql_table(q, connString, index_col=uid.label('uid'))
Edit: A simpler form of this also appears to work (see comment):
daskDF = ddf.read_sql_table('testtable', connString, index_col='uid',
columns=['uid', sa.sql.column('datetime').cast(sa.types.String).label('datetime')])

How to Reset the Date/Time on an M500 Sport DV Camera?

I recently bought an M500 Sport DV cam. I am unable to reset/change the date and time. According to the manual, the cam will create a SportDV.txt file on the SD card, and the date and time can be changed from that file.
But my cam is not creating any SportDV.txt file. It only creates two folders: Data (which contains an empty base.dat file) and DCIM (which contains videos and images).
I tried to create the file manually, but it doesn't change the date/time. I also tried different methods, like creating files named times.txt, time.txt, timeset.txt, tag.txt, and settime.txt, but nothing works.
I am unable to change the date and time. It always shows the year 2158 instead of 2015.
Sample date: 2158/8/14 22:10:22
I tried everything and failed, but then I found the solution.
Open Notepad and copy & paste the following:
SPORTS DV
UPDATE:N
FORMAT
EV:6
CTST:100
SAT:100
AWB:0
SHARPNESS:100
AudioVol:1
QUALITY:0
LIGHTFREQ:0
AE:0
RTCDisplay:1
year:2014
month:7
date:7
hour:16
minute:11
second:0
-------------------------------
Exposure(EV)
0 ~ 12, def:6
Contrast(CTST)
1 ~ 200, def:100
Saturation(SAT)
1 ~ 200, def:100
White Balance(AWB)
0 ~ 3, def:0, 0(auto), 1(Daylight), 2(Cloudy), 3(Fluorescent)
Sharpness
1 ~ 200, def:100
AudioVol
0 ~ 2, def:1, 0:Max 1:Mid 2:Min
QUALITY
0 ~ 2, def:0, 0:High 1:Middle 2:Low
LIGHTFREQ
0 ~ 1, def:0, 0:60Hz 1:50Hz
AUTO EXPOSURE(AE)
0 ~ 2, def:0, 0:Average 1:Center 2:Spot
RTCDisplay
0 ~ 1, def:1, 0:Off 1:On
year
2012 - 2038, def:2013
month
01 - 12, def:1
date
01 - 31, def:1
hour
00 - 23, def:0
minute
01 - 59, def:0
second
01 - 59, def:0
Set UPDATE:N to UPDATE:Y, change year, month, and date, then save the file with the name SportDV and UTF-8 encoding.
For versions that have a time.bat file, putting an N at the end of the timestamp in the time.txt file removes the timestamp from the video, i.e. time.txt:
2015.11.13 20:13:31 N
I have the more recent version of the M500 mini camera that doesn't use the SportDV.txt file.
It looks the same physically as the earlier one (same LEDs, same decals), but after being reset it instead has a time.bat file in the root of the card. Executing this on a Windows machine produces a file called time.txt, except the format this batch file writes doesn't work.
I edited the time.txt file and restarted the camera, and it worked after following Andy's format from his posting on the dx.com site:
Choose edit and make sure you replace the (probably nonsense-format) contents with 2015.11.13 20:13:31 - in this case that's YYYY.MM.DD HH:MM:SS. Click save, then turn off/eject the camera. Power up while not connected to the PC and make a short capture. Now when you check the content, the date/time will hopefully be right.
AFAIK there is no updated firmware for this version of the camera to change from 3-minute files or hide the time/date text :-(

Django Oracle integrity error when saving any instance to database

I'm doing a migration from sqlite to oracle backend. The oracle database already exists and is maintained by other people. Its version is Oracle9i Enterprise Edition Release 9.2.0.1.0.
I have a simple model:
class AliasType(models.Model):
    id = models.AutoField(primary_key=True, db_column="F_ALIAS_ID")
    name = models.CharField(u"Type name", max_length=255, unique=True, db_column="F_ALIAS_NAME")

    class Meta:
        db_table = "ALIAS"
./manage.py syncdb does not return any errors. But when I try to create a new instance and save it to the database, I get the following error:
>>> AliasType.objects.create(name="test")
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/mnt/Data/private/projects/envs/termary-oracle/src/django/django/db/models/manager.py", line 138, in create
return self.get_query_set().create(**kwargs)
File "/mnt/Data/private/projects/envs/termary-oracle/src/django/django/db/models/query.py", line 360, in create
obj.save(force_insert=True, using=self.db)
File "/mnt/Data/private/projects/envs/termary-oracle/src/django/django/db/models/base.py", line 460, in save
self.save_base(using=using, force_insert=force_insert, force_update=force_update)
File "/mnt/Data/private/projects/envs/termary-oracle/src/django/django/db/models/base.py", line 553, in save_base
result = manager._insert(values, return_id=update_pk, using=using)
File "/mnt/Data/private/projects/envs/termary-oracle/src/django/django/db/models/manager.py", line 195, in _insert
return insert_query(self.model, values, **kwargs)
File "/mnt/Data/private/projects/envs/termary-oracle/src/django/django/db/models/query.py", line 1435, in insert_query
return query.get_compiler(using=using).execute_sql(return_id)
File "/mnt/Data/private/projects/envs/termary-oracle/src/django/django/db/models/sql/compiler.py", line 791, in execute_sql
cursor = super(SQLInsertCompiler, self).execute_sql(None)
File "/mnt/Data/private/projects/envs/termary-oracle/src/django/django/db/models/sql/compiler.py", line 735, in execute_sql
cursor.execute(sql, params)
File "/mnt/Data/private/projects/envs/termary-oracle/src/django/django/db/backends/util.py", line 18, in execute
return self.cursor.execute(sql, params)
File "/mnt/Data/private/projects/envs/termary-oracle/src/django/django/db/backends/oracle/base.py", line 630, in execute
return self.cursor.execute(query, self._param_generator(params))
IntegrityError: ORA-01400: cannot insert NULL into ("SINCE"."ALIAS"."F_ALIAS_ID")
If I specify the id explicitly, e.g. AliasType.objects.create(id=5, name="test"), it works. I thought Django should be able to retrieve the id value automatically. I've learnt that Oracle does not support autoincrement and that I should use triggers and sequences instead. I was told that there is an existing sequence in the database that returns ids for all new rows, and I know its name, say SEQ_GET_NEW_ID.
So the question is how to implement that in the most elegant way, i.e. how to tell Django to get id values for all new objects from the sequence named SEQ_GET_NEW_ID without hacking it too much (e.g. overriding save() methods for all models)?
There is a ticket open (#1946) to allow exactly that, overriding the default sequence name. But as it's not closed yet, I don't think there is a way without hacking.
I haven't used Oracle before, but a quick search suggests that it is possible to create aliases/synonyms for sequences. manage.py sqlall <app> should show you the sequence name Django is expecting. So you probably could just make this an alias for SEQ_GET_NEW_ID.
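As a sketch (the sequence name ALIAS_SQ is an assumption based on Django's usual <table>_SQ naming for the ALIAS table; confirm the exact name with manage.py sqlall, and note the schema user needs the CREATE SYNONYM privilege):
from django.db import connection

# Hypothetical names: map the sequence Django expects onto the existing one.
cursor = connection.cursor()
cursor.execute('CREATE SYNONYM ALIAS_SQ FOR SEQ_GET_NEW_ID')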