I have a csv file, the one from https://www.kaggle.com/jolasa/waves-measuring-buoys-data-mooloolaba/downloads/waves-measuring-buoys-data-mooloolaba.zip/1. The first entries look like this:
The first column has dates which I'm trying to read with this command:
matrix = dlmread ('waves-measuring-buoys-data/WavesMooloolabaJan2017toJun2019.csv',',',1,0);
(If referring to file on Kaggle, note that I slightly modified the directory and file names for ease of reading)
Then when I check a date by printing matrix(2,1), I get 1 instead of 01/01/2017 00:00.
How do I get the correct format?
dlmread, like csvread, is only for numeric input, so the date strings cannot be read this way.
Use csv2cell from the io package instead to obtain your data as strings, and then perform any necessary string operations and conversions accordingly.
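The same idea (read every cell as text first, then convert each column explicitly) can be sketched in Python. This is only an illustration, not the Octave solution itself: the sample rows, the column names and the day/month/year date format are assumptions, since the real file contents are not shown here.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample standing in for the real file; the column names and
# values are invented for illustration only.
SAMPLE = (
    "Date/Time,Hs\n"
    "01/01/2017 00:00,0.763\n"
    "01/01/2017 00:30,0.770\n"
)


def read_rows(text, date_format="%d/%m/%Y %H:%M"):
    """Read every cell as a string, then convert each column explicitly."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    rows = []
    for date_text, value_text in reader:
        # The date stays a real timestamp instead of being parsed as a number.
        rows.append((datetime.strptime(date_text, date_format), float(value_text)))
    return rows


rows = read_rows(SAMPLE)
print(rows[0][0])
```

The date format string is the part most likely to need adjusting for the real data.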
I'm currently doing this to generate a dataframe:
dataframe = pd.read_sql("select date_trunc('minute', log_time) as date, .....
my output is a time that looks like this:
"date":"2020-06-01 00:08:00.000"
What I want to do is have a time output that looks like this in the json file that it is outputted to:
"date":"2020-06-08T23:01:00.000Z"
I found documents that show how to remove the T and Z, but I'm not sure how to add them. Do I have to do this after the dataframe is made, or is there something in my date_trunc( command that should put it in this format?
Based on our conversation in the comments section, I have edited your question and added "in the JSON file that it is outputted to" to the line "What I want to do is have a time output that looks like this". At the end of the day, the only thing that matters is that the raw value is accurate in your JSON file. Don't worry about what it looks like in your Jupyter Notebook. I think this is a common mistake, and one that I have made in the past as well.
I would suggest not worrying about the datetime format in pandas. Just go with pandas default date/time until the very end.
THEN, as a final step, just before exporting to a JSON, change the format of the field to:
df['TIME'] = pd.to_datetime(df['TIME']).dt.strftime('%Y-%m-%dT%H:%M:%S.%f').str[:-3] + 'Z'
That will change the values to the format 2020-06-08T23:01:00.000Z.
Note that .str[:-3] is required because, according to the documentation, strftime has no directive for milliseconds (3 decimals), only %f for microseconds (6 decimals). As such, you need to drop the last 3 digits to get millisecond precision.
That specific format is not directly supported with T and Z, so I did a little bit of string manipulation.
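The truncation step can be checked on a single value with the standard library alone. This is a minimal sketch of the same string manipulation, not the pandas call itself; the helper name to_iso_millis is made up for illustration, and the timestamp is assumed to already be in UTC (the code only appends the Z, it does not convert time zones).

```python
from datetime import datetime


def to_iso_millis(ts):
    """Format a datetime as YYYY-MM-DDTHH:MM:SS.mmmZ (hypothetical helper).

    %f always yields 6 digits (microseconds), so the last 3 are cut off to
    leave milliseconds. The value is truncated, not rounded.
    """
    return ts.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"


print(to_iso_millis(datetime(2020, 6, 8, 23, 1, 0)))
```

For the example timestamp this prints 2020-06-08T23:01:00.000Z.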
I have an excel created from a comma-delimited text file originally from a .sql file with an SQL INSERT query.
In one of the columns I have: "Cast(0x123456AB...) As TIME
Obviously this is NOT the jsondate format... so no help from that question...
I replaced the Cast( and replaced the ") As TIME" with empty strings.
So now I have the time values in hexadecimal.
How do I convert them into Excel Time or Datetime?
OK, playing around with it showed me that it's exactly the same as the jquery date answer. You take the numeric portion starting with 0x.
Take the 10 digits AFTER the 0x, e.g. in A2: =MID(A1, 3, 10)
Convert it from hexadecimal to decimal, e.g. in A3: =HEX2DEC(A2)
Divide by 86400 (the number of seconds in a day), e.g. in A4: =A3/86400
And add the result to the 1/1/1970 date, e.g. in A5: =A4 + DATE(1970, 1, 1)
Or in short:
=(hex2dec(mid(a1,numstart,10))/86400) + date(1970,1,1)
Replace numstart with the 1-starting index of the number.
e.g. 3 if you have a 12 or 13 digit number like 0x12345678AB and you'll get 12345678AB
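The formula above can be cross-checked outside Excel with a short Python sketch. It mirrors MID, HEX2DEC and the division by 86400 under the same assumption the formula makes, namely that the hex value is a count of seconds since 1970-01-01. The function name and the sample cell value are invented for illustration.

```python
from datetime import datetime, timedelta


def hex_cell_to_datetime(cell, numstart=3, length=10):
    """Mirror =(HEX2DEC(MID(A1, numstart, 10))/86400) + DATE(1970,1,1).

    numstart is 1-based, like MID. Leading/trailing spaces are stripped
    first, because a stray space shifts the start position.
    """
    text = cell.strip()
    digits = text[numstart - 1 : numstart - 1 + length]
    seconds = int(digits, 16)  # accepts upper- and lowercase hex digits
    return datetime(1970, 1, 1) + timedelta(seconds=seconds)


# 0x15180 is 86400 seconds, i.e. exactly one day after the epoch.
print(hex_cell_to_datetime("0x0000015180"))
```

If the Excel result and this result disagree for the same input, the start position (numstart) is the first thing to check.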
This is similar to the Convert JSON Date /Date(1388624400000)/ to Date in Excel
Except that:
a. The question was answered wrong and wouldn't work. (I edited it)
b. The .sql file was retrieved in a stored procedure from the database via SQL. While in the question they were using jquery returned ajax data, which seemed to differ. Turns out they're the same number with a different format.
As an added remark, I had a space mark at the beginning of my hex number. Until I did the MID on it, I didn't see that.
Note: When using ajax returned formatted dates like /date:0x12345678ab/ you'll set numstart to 8. If hex2dec fails, try turning the hex string into uppercase before calling hex2dec. To debug, just put each formula in a separate cell, so you see what works and what doesn't.
So I have a file on my AS400 as a result of DSPJRN and I want to look at some data in the JOESD field which is the after image from the journal of a file. This is defined as char with CCSID = 65535. I guess this is because it is the whole record with a mixture of ccsid and numeric fields.
I can use substr() to get the actual field from the original file. In the original file the column is defined as graphic(10) ccsid 13488. That's UCS-2. If I do hex(substr(joesd,522,20)) I get a result of 004100530044... and so on, so I know it's the correct data, but I can't get it to display as 'ASD...'
I tried graphic(substr(joesd,522,20),10,13488) but it gives an error that the conversion from ccsid 65535 to 13488 isn't valid. I don't want to convert it but interpret it as the other ccsid
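To make "interpret, not convert" concrete, here is a small Python sketch run outside the database on the hex digits quoted above. It assumes the bytes are big-endian UCS-2 and uses Python's utf-16-be codec as a stand-in for CCSID 13488, which matches for characters in the Basic Multilingual Plane. It is not a Db2 for i solution, only an illustration of what the bytes contain.

```python
# The first six bytes of the hex dump quoted in the question.
raw = bytes.fromhex("004100530044")

# Reinterpret the same bytes as 2-byte big-endian code units.
# Nothing is converted; the bytes are only read with a different encoding.
text = raw.decode("utf-16-be")

print(text)
```

Each pair of bytes (0041, 0053, 0044) maps to one character, which is why the 20-byte substring corresponds to a graphic(10) column.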
GRAPHIC() doesn't take CCSID as a parm. The third parm is length according to my 7.1 reference.
What version are you using?
I thought CAST() might be a solution, but it doesn't appear to work.
As I see it, one option would be to build a user defined function (UDF) that does the conversion you need; possibly with the iconv() API.
The other option would be to dump the data into a properly formatted file. I use the DBUJRN utility from DBU. There are other similar options, including an open source one (sorry that the description is in German, but Google Translate does a good enough job to figure out the source to download).
The utilities basically work the same way; you can in fact run through the same process manually. Try the following:
Step 1 (the DSPJRN you've been doing)
DSPJRN <...> OUTFILE(MYLIB/MYJRNOUT)
Step 2 - Create a new file with the journal header fields followed by all the fields from your journaled file (MYFILE)
CREATE TABLE mylib/mytbl as
( select JOENTL, JOSEQN, JOCODE, JOENTT, JODATE,
JOTIME, JOJOB, JOUSER, JONBR, JOPGM, JOOBJ,
JOLIB, JOMBR, JOCTRR, JOFLAG, JOCCID,
JOINCDAT, JOMINESD, JORES,
m.*
from MYLIB/MYJRNOUT , MYLIB/MYFILE m
) with no data
Step 3 - Copy the data without regard to the format differences
CPYF FROMFILE(MYLIB/MYJRNOUT) TOFILE(MYLIB/MYTBL) MBROPT(*ADD) FMTOPT(*NOCHK)
You should end up with the data originally in JOESD split into its appropriate fields.
Note of course that this technique only works for one file at a time. Also, make sure you're only dumping *RCD entries and you'll probably want to skip the DELETE entries.
I have created a Fortran 90 code to filter and convert the text output of another program into CSV form. The file contains a table with columns of various types (character, real, integer). There is a column that generally contains decimal values (probability values). But in some rows, where the value should be the decimal "1.000", the value is actually the integer "1".
I use "F5.3" specifier to read this column and I have the same format statement for every row of the table. So, when the code finds "1", it reads ".001", because it does not find a decimal point.
What ways could I use to correctly (and generally) read integers among other decimals?
Could I specify "unformatted" input only for a number of "spaces"?
On input, the floating point data edit descriptor Fw.d is normally used with d set to zero (d cannot be omitted). On input, d only matters when the field contains no decimal point: the digits are then scaled by 10^-d. A nonzero d is therefore used only in the rare case when the floating point data is stored as scaled integers, or when you do some unit conversion from the integer values.
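The input rule can be emulated in a few lines of Python to see why "1" becomes .001 under F5.3 and stays 1.0 under F5.0. This is a simplified sketch of the rule only: the function name read_f is made up, and exponent forms and embedded blanks are ignored.

```python
def read_f(field, d):
    """Emulate Fortran Fw.d input for a plain numeric field (simplified).

    If the field contains a decimal point, d is ignored.
    If it does not, the digits are scaled by 10**-d.
    """
    text = field.strip()
    if not text:
        return 0.0  # an all-blank field is read as zero
    if "." in text:
        return float(text)
    return int(text) / 10 ** d


print(read_f("    1", 3))  # no decimal point, so the value is scaled by 10**-3
print(read_f("    1", 0))  # with d = 0 the integer is read as is
print(read_f("1.000", 3))  # an explicit decimal point overrides d
```

With d = 0, both "1" and "1.000" are read as 1.0, which is the behaviour wanted for the mixed column.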
You could try using list-directed input: use a * instead of a format specifier. This would be for the entire read, not selected items. Or you could read the lines into a string and test their contents to decide how to read them. If the sub-string has a decimal point: read (string(M:N), '(F5.3)') value. If it doesn't, use a different format, e.g., perhaps read it as F5.0.
P.S. "unformatted" is reading binary data without conversion ... it is a direct copy of the data from the file to the data item. "list-directed" is the Fortran term for reading & converting data without using a format specification.
Well, here's something new to me: f90 allows a mix of comma and space delimiters for a simple list-directed read:
read(unit,*)v1,v2,v3,v4
with input
1.222 2 , 3.14 , 4
yields
1.222000 2.000000 3.140000 4.000000
I am writing out a series of files in a while loop, they are named
1.dat, 2.dat, 3.dat...
but I need the numbers to be formatted to -
00001.dat, 00002.dat, 00003.dat...
so that the file order is maintained when reading them back in. Is there a way to change the format of the while loop iterator?
Use the "Format Into String" node; it can pad an integer with leading zeros (for example, with a format string such as %05d).
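For comparison, the same zero-padding idea in a text language looks like the Python sketch below. It is not LabVIEW code; it only shows what a width-5, zero-padded integer format produces, and the loop bounds are arbitrary.

```python
# Build zero-padded file names so that alphabetical order matches numeric order.
names = []
i = 1
while i <= 3:
    names.append(f"{i:05d}.dat")  # width 5, padded on the left with zeros
    i += 1

print(names)
```

The width should be chosen large enough for the highest file number expected, otherwise the ordering breaks again once the numbers outgrow the padding.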