Microsoft R- Tidyverse: If a data file fails to load, create an empty tibble/table in its place

I am really not sure how to phrase this concisely. My question is: is it possible to add error handling so that if a data file (such as a CSV) fails to load as a table/tibble, a blank version of it is created instead?
Here is what I mean:
My normal csv load looks like this:
Monday2 <- paste0("my_file_location/my_file_name",Monday,".csv")
leads1 <- tibble(read.csv(Monday2))
Tuesday2 <- paste0("my_file_location/My_file_name",Tuesday,".csv")
leads2 <- tibble(read.csv(Tuesday2))
Wednesday2 <- paste0("my_file_location/my_file_name",Wednesday,".csv")
leads3 <- tibble(read.csv(Wednesday2))
If for some reason my csv failed to load (the file doesn't exist, or I entered the name incorrectly for example) can a blank version of it be created?
My idea for the blank tibble would look like this:
Leads21 <- tibble("Column1"= "", "Column2"= "", "Column3"= "")
Leads22 <- tibble("Column1"= "", "Column2"= "", "Column3"= "")
Leads23 <- tibble("Column1"= "", "Column2"= "", "Column3"= "")
This blank tibble would have exactly the same columns as a properly loaded file. I have 5 files that I bind each Friday in an automated process, and if a file fails to load I can catch it downstream (one of the columns is the file name/date), but I don't want the whole process to fail.
A typical 'failed to load' error looks like this:
In file(file, "rt") : cannot open file 'my_file_location/My_file_name_2022-03-27.csv': No such
file or directory
The bind of all 5 files then fails with an error message like:
### Join full weeks worth of leads into 1 file
Leads <- bind_rows(leads1,leads2,leads3, leads4, leads5)
Error in list2(...) : object 'leads1' not found
This then causes the rest of my code to fail or act incorrectly. If I could bind an empty tibble instead, my code could finish running and I could check for missing files at the end. Ultimately, a missing file is less important than processing the files that do exist, so stopping my code to locate or fix the failed load is not a priority.
My background is in Microsoft Access VBA, and I keep trying to write something like:
If tibble Leads1 exists, use it. If tibble Leads1 does not exist, use Leads21.
I am not sure how to do this in R. I have been trying to read about and understand the try() wrapper, but I don't understand how to use it in my case.

Related

How to save an updated fits file with headers in correct places?

I want to edit the data in my FITS file using astropy and then save it back to the original file. Below are my code and the error message. Please ignore any redundant line: I know I open the file twice, but I still get the error after deleting the extra open.
import glob
import numpy as np
from astropy.io import fits

file_list = sorted(glob.glob('*.fits')) # read in my three fits files
hdudata = np.full((3,720,1440), 0) # a test list to store the data
for im in range(len(file_list)):
    hdu_list = fits.open(file_list[im])
    hdudata[im] = hdu_list[0].data # read in the data from fits file
    if im == 2: # I only want to change the last image
        with fits.open(file_list[im], mode='update') as hdus:
            hdu = hdus[0]
            hdu.data = (hdudata[im-1] + hdudata[im])/2. # basically add two images
                                                        # and take the average
            hdu.close() # this is required otherwise an error message pops up saying
                        # the next line cannot proceed as the file is being run
            hdu.flush() # the error line
VerifyError:
Verification reported errors:
HDU 0:
'NAXIS1' card at the wrong place (card 4).
'NAXIS2' card at the wrong place (card 5).
'EXTEND' card at the wrong place (card 6).
Note: astropy.io.fits uses zero-based indexing.
I have only accessed and changed the data, so why does the error occur in my header? I had no problem reading the headers (though I didn't include that in the code above), so why is it faulty when saving?

Issue automating CSV import to an RSQLite DB

I'm trying to automate writing CSV files to an RSQLite DB.
I am doing so by indexing csvFiles, which is a list of data.frame variables stored in the environment.
I can't seem to figure out why my dbWriteTable() code works perfectly fine when I enter it manually but not when I try to index the name and value fields.
### CREATE DB ###
mydb <- dbConnect(RSQLite::SQLite(),"")
# FOR LOOP TO BATCH IMPORT DATA INTO DATABASE
for (i in 1:length(csvFiles)) {
  dbWriteTable(mydb,name = csvFiles[i], value = csvFiles[i], overwrite=T)
  i=i+1
}
# EXAMPLE CODE THAT SUCCESSFULLY MANUAL IMPORTS INTO mydb
dbWriteTable(mydb,"DEPARTMENT",DEPARTMENT)
When I run the for loop above, I'm given this error:
"Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
cannot open file 'DEPARTMENT': No such file or directory
# note that 'DEPARTMENT' is the value of csvFiles[1]
Here's the dput output of csvFiles:
c("DEPARTMENT", "EMPLOYEE_PHONE", "PRODUCT", "EMPLOYEE", "SALES_ORDER_LINE",
"SALES_ORDER", "CUSTOMER", "INVOICES", "STOCK_TOTAL")
I've researched this error and it seems to be related to my working directory; however, I don't really understand what to change, as I'm not even trying to manipulate files from my computer, simply data.frames already in my environment.
Please help!

Simply use get() for the value argument: you are passing a string where a data frame object is expected. Notice that your manual version does not quote DEPARTMENT for value.
# FOR LOOP TO BATCH IMPORT DATA INTO DATABASE
for (i in seq_along(csvFiles)) {
  dbWriteTable(mydb,name = csvFiles[i], value = get(csvFiles[i]), overwrite=T)
}
Alternatively, consider building a named list of data frames with mget, then looping element-wise over the list's names and data frames with Map:
dfs <- mget(csvFiles)
output <- Map(function(n, d) dbWriteTable(mydb, name = n, value = d, overwrite=T), names(dfs), dfs)

writeOGR error: creation of output file failed

I'm an R rookie attempting to create home ranges from fish telemetry data using kernel density estimates within the adehabitatHR package:
kud <- kernelUD(muskydetectdata.P[,6], h="href", extent = 5)
class(kud)
image(kud)
kud[[1]]@h
muskykud.P95 <- getverticeshr(kud, percent = 95)
muskykud.P95
muskykud.P50 <- getverticeshr(kud, percent = 50)
muskykud.P50
When exporting to a shapefile with
writeOGR(muskydetectdata.sp,"musky_kde1", "gps",
driver="ESRI Shapefile",
dataset_options= "FieldName= id")
an error message is displayed:
##creation of output file failed
I have also attempted to use writeSpatialShape, with similar results.
I'm using R version 3.3.2 on 64-bit Windows.

I had the same problem and only solved it once I gave the full directory path, a layer name, and a .shp suffix:
writeOGR(muskydetectdata.sp, dsn="d:/your directory here/musky_kde.shp", layer="musky_kde", driver="ESRI Shapefile")

I had that same error.
I resolved mine by correcting the directory it was saving to (making sure it existed), e.g.
writeOGR(muskydetectdata.sp, dsn = save.dir, layer = filename.save, driver = 'ESRI Shapefile')
where save.dir is the target directory as a string and filename.save is the file name to save under (excluding the extension).

I guess you are trying to write to an existing file, which the writeOGR function doesn't allow. As far as I remember, this is known behavior of some drivers supported by OGR (in R as well as in Python and the C API).
You have to check whether the file exists before writing and remove it (or change the path you want to use).
For example, here the first write operation succeeds, but the attempt to overwrite the file fails with your error message:
> rgdal::writeOGR(spdf, 'b.shp', layer="brazil", driver='ESRI Shapefile')
> rgdal::writeOGR(spdf, 'b.shp', layer="brazil", driver='ESRI Shapefile')
Error in rgdal::writeOGR(spdf, "b.shp", layer = "brazil", driver = "ESRI Shapefile") :
Creation of output file failed

Uploading job fails on the same file that was uploaded successfully before

I'm running a regular upload job that loads a CSV into BigQuery. The job runs every hour. The most recent failure log says:
Error: [REASON] invalid [MESSAGE] Invalid argument: service.geotab.com [LOCATION] File: 0 / Offset:268436098 / Line:218637 / Field:2
Error: [REASON] invalid [MESSAGE] Too many errors encountered. Limit is: 0. [LOCATION]
I went to line 218638 (the original CSV has a header line, so I assume 218638 is the line that actually failed; let me know if I'm wrong), but it looks fine. I checked the corresponding table in BigQuery and it contains that line too, which means I successfully uploaded this line before.
So why has it started failing recently?
project id: red-road-574
Job ID: Job_Upload-7EDCB180-2A2E-492B-9143-BEFFB36E5BB5

This indicates that there was a problem with the data in your file, where it didn't match the schema.
The error message says it occurred at File: 0 / Offset:268436098 / Line:218637 / Field:2. This means the first file (it looks like you just had one), and then the chunk of the file starting at 268436098 bytes from the beginning of the file, then the 218637th line from that file offset.
The reason for the offset portion is that BigQuery processes large files in parallel across multiple workers. Each file worker starts at an offset from the beginning of the file. The offset that we include is the offset that the worker started from.
From the rest of the error message, it looks like the string service.geotab.com showed up in the second field, but the second field was a number, and service.geotab.com isn't a valid number. Perhaps there was a stray newline?
You can see what the lines looked like around the error by doing:
cat <yourfile> | tail -c +268436098 | tail -n +218636 | head -3
This will print out three lines... the one before the error (since I used -n +218636 instead of +218637), the one that had the error, and the next line as well.
Note that if this is just one line in the file that has a problem, you may be able to work around the issue by specifying maxBadRecords.
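The same lookup can be sketched in Python for cases where the shell pipeline is inconvenient. This is only a minimal illustration, assuming the offset is a zero-based byte position and the line number is counted from that offset as described above; the function name lines_near_error and its parameters are hypothetical, not part of any BigQuery tooling.

```python
from itertools import islice


def lines_near_error(path, byte_offset, line_in_chunk, context=1):
    """Return the lines around `line_in_chunk` (1-based), counted from `byte_offset`.

    Hypothetical helper: `byte_offset` is where the worker's chunk starts,
    `line_in_chunk` is the line number reported relative to that chunk.
    """
    first = max(line_in_chunk - context, 1)
    last = line_in_chunk + context
    with open(path, "rb") as f:
        # Jump to the start of the chunk instead of reading the whole file.
        f.seek(byte_offset)
        # islice uses a zero-based start and an exclusive stop.
        selected = islice(f, first - 1, last)
        return [raw.decode("utf-8", errors="replace").rstrip("\r\n")
                for raw in selected]
```

With the values from the error message, lines_near_error("yourfile.csv", 268436098, 218637) would return the line before the failure, the failing line, and the line after it.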

How to modify a line in a file with Erlang OTP module

I have a big file and I would like to replace its first line with other content.
When I use {ok, IoDev} = file:open("/root/FileName", [write, raw, binary]), the whole content is removed.
But when I use {ok, IoDev} = file:open("/root/FileName", [append, raw, binary]) and file:pwrite(S, {bof,0}, <<"new content\n">>), I get the result {error, badarg}.
If I set Location to 0, as in file:pwrite(S, 0, <<"new content\n">>), the string is appended at the end of the file.

You seem to be confused about the actual file API.
file:open/2 will truncate the file if you pass [write, raw, binary] as you do:
(about write mode): The file is opened for writing. It is created if it does not exist. If the file exists, and if write is not combined with read, the file will be truncated.
So you need to pass either [write, read] or [write, append] as documented.
file:pwrite/3 also works exactly as documented. It allows you to write at a given position in the file. In particular, you cannot pass {bof, 0} as second argument since you opened the file in raw mode:
If IoDevice has been opened in raw mode, some restrictions apply: Location is only allowed to be an integer; and the current position of the file is undefined after the operation.
The following sample code shows how they work:
ok = file:write_file("/tmp/file", "This is line 1.\nThis is line 2.\n"),
{ok, F} = file:open("/tmp/file", [read, write, raw, binary]),
ok = file:pwrite(F, 0, <<"This is line A.\n">>),
ok = file:close(F),
{ok, Content} = file:read_file("/tmp/file"),
io:put_chars(Content),
ok = file:delete("/tmp/file").
It will output:
This is line A.
This is line 2.
This works because the text "This is line A.\n" is exactly as long as "This is line 1.\n". It does not really replace the line, only the bytes. If you need to replace the first line with content of a different length, you need to rewrite the whole content of the file. A common approach is indeed to write a new file and then swap the two. If the file is small enough, however, you can read it entirely into memory and rewrite it. file:read_file/1 and file:write_file/2 would work:
replace_first_line(Path, NewLine) ->
    {ok, Content} = file:read_file(Path),
    %% Split once at the first newline; the old first line is discarded.
    [_FirstLine | Tail] = binary:split(Content, <<"\n">>),
    NewContent = [NewLine, <<"\n">> | Tail],
    ok = file:write_file(Path, NewContent).

The question is not specific to Erlang; it is about general file operations.
Replacing a line in a file requires rewriting the whole file. The easiest way to do so is to write all the new content to a new file and then move that file over the original.
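Since the approach is language-independent, here is a minimal Python sketch of the write-then-move pattern. It assumes the temporary file can be created in the same directory as the original (so the final move stays on one filesystem); the function name replace_first_line mirrors the Erlang example above and is otherwise hypothetical.

```python
import os
import shutil
import tempfile


def replace_first_line(path, new_line):
    """Replace the first line of `path` with `new_line` (bytes) via a temp file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the original so the final rename
    # does not have to cross filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            src.readline()  # skip the old first line
            dst.write(new_line.rstrip(b"\n") + b"\n")
            shutil.copyfileobj(src, dst)  # stream the rest without loading it all
        os.replace(tmp_path, path)  # move the new file over the original
    except BaseException:
        # Clean up the temp file if anything went wrong before the move.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Because the remainder of the file is streamed, this works for files too large to hold in memory. Note that the new file gets the temp file's permissions, so a real implementation may want to copy the original's mode first.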
