I just started learning Python and now I'm trying to integrate it with my GIS knowledge. As the title suggests, I'm attempting to convert an Excel sheet to a table, but I keep getting errors. One is wholly undecipherable to me, and the other seems to suggest that my file does not exist, which I know is incorrect since I copied its location directly from its properties.
Here is a screenshot of my environment. Please help if you can and thanks in advance.
[Screenshot: Environment/Error]
Simply put, you included the workspace directory in the filename variable, so when arcpy handles it, it tries to access a file that does not exist, in an unknown workspace.
Try this.
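# Raw string with no trailing backslash: a backslash right before the closing quote would escape it
arcpy.env.workspace = r"J:\egis_work\dpcd\projects\SHARITA\Python"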
arcpy.ExcelToTable_conversion("Exceltest.xlsx", "Bookstorestable", "Sheet1")
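A "file does not exist" error on a path copied from Windows properties is often a string-literal problem rather than a missing file. The following is a minimal sketch of that effect, using hypothetical paths that are not from the question; it only illustrates how Python reads backslashes and how a path can be checked before it is passed to a tool.

```python
import os

# Hypothetical paths, for illustration only.
plain = "C:\temp\new.xlsx"   # \t is read as a tab and \n as a newline
raw = r"C:\temp\new.xlsx"    # raw string: every backslash is kept literally


def describe(path):
    """Report whether a path exists before handing it to a tool."""
    return f"{path!r} exists: {os.path.exists(path)}"


print(describe(plain))
print(describe(raw))
```

Printing the `repr` of a path makes hidden tabs and newlines visible, which is usually enough to spot this kind of mistake.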
arcpy converts geodatabase tables to Excel with the TableToExcel_conversion tool.
It is straightforward.
Example
Excel tables cannot be stored in the geodatabase. The most reasonable thing is to store them in the root folder that contains the geodatabase holding the table. Say I want to convert a table to Excel and save it in the folder that contains the geodatabase.
I proceed as follows; the explanations are in the comments after each #.
import arcpy
import os
from datetime import datetime
# Set environment settings
in_table= r"C:\working\Sunderwood\Network Analyst\MarchDistances\Centroid.gdb\SunderwoodFirstArcpyTable"
# Build the output file name: table name + today's date
out_xls = os.path.basename(in_table) + datetime.now().strftime('%Y%m%d')
# os.path.basename(in_table) returns the last component of the path. In this case, it returns the table name
# + is used in Python to concatenate strings
# datetime.now() gives the current date and time
# strftime('%Y%m%d') converts it into a string in the format YYYYMMDD
# Put together, the statements above give a new file name: the input table name plus today's date
# os.path.dirname() returns the directory part of the specified path
geodatabase = os.path.dirname(in_table)
# In this case, os.path.dirname(in_table) gives us the geodatabase
# The join() method takes all items in an iterable and joins them into one string
SaveInFolder= "\\".join(geodatabase.split('\\')[:-1])
# Here I split the geodatabase path on \, drop the last component with [:-1], and join the rest back together with \.
# The split() method splits a string into a list.
# In the case above it gives ['C:', 'working', 'Sunderwood', 'Network Analyst', 'MarchDistances', 'Centroid.gdb']. The [:-1] slice removes 'Centroid.gdb', so the join leaves "C:\working\Sunderwood\Network Analyst\MarchDistances"
#Before I tell arcpy to save, I have to specify the workspace in which it will save. So I now make my environment the SaveInFolder
arcpy.env.workspace =SaveInFolder
# Now I have to tell arcpy what I will call my new table. I use os.path.join. This method joins path components with exactly one directory separator (\ on Windows) after each non-empty part except the last
newtable = os.path.join(arcpy.env.workspace, out_xls)
# In the above case it will give me "C:\working\Sunderwood\Network Analyst\MarchDistances\SunderwoodFirstArcpyTable20200402" (when run on 2020-04-02)
# Notice that newtable does not have an Excel extension. I use + to concatenate .xls onto the path and make it "C:\working\Sunderwood\Network Analyst\MarchDistances\SunderwoodFirstArcpyTable20200402.xls"
table= newtable+".xls"
#Finally, I call the arcpy method and feed it with the required variables
# Execute TableToExcel
arcpy.TableToExcel_conversion(in_table, table)
print(table + " is now available")
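The path handling in the script above can be sketched as one small function. This is only an illustration, not part of the original answer: the function name `excel_path_next_to_gdb` is hypothetical, and it uses `ntpath` (the Windows flavour of `os.path`) so the backslash logic behaves the same on any platform. It does not call arcpy.

```python
import ntpath
from datetime import datetime


def excel_path_next_to_gdb(in_table, today):
    """Build '<folder containing the .gdb>\\<table name><YYYYMMDD>.xls'."""
    geodatabase = ntpath.dirname(in_table)    # ...\Centroid.gdb
    save_folder = ntpath.dirname(geodatabase)  # folder that holds the .gdb
    name = ntpath.basename(in_table) + today.strftime("%Y%m%d") + ".xls"
    return ntpath.join(save_folder, name)


in_table = r"C:\working\Sunderwood\Network Analyst\MarchDistances\Centroid.gdb\SunderwoodFirstArcpyTable"
print(excel_path_next_to_gdb(in_table, datetime(2020, 4, 2)))
```

Calling `dirname` twice replaces the split/join step, and passing the date in as an argument makes the result reproducible.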
Related
I am programming (Pandas) around a problem where certain generated files are saved with a date attached to the file. For example: file-name_20220814.csv.
However, these files change each time they are generated, creating a new ending to the file. What is the best way to use a wildcard to stand for these file date endings?
Glob? How would I do that in the following code:
df1 = pd.read_csv('files/file-name_20220816.csv')
Answer provided by @mitoRibo:
pd.read_csv(glob('files/file-name_*csv')[0])  # requires: from glob import glob
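If several dated files can sit in the folder at once, taking element `[0]` may not return the newest one, because `glob` does not guarantee any ordering. A minimal sketch of one way to handle that, assuming the names always end in `_YYYYMMDD.csv` as in the question; the helper name `latest_dated_file` is hypothetical.

```python
import glob
import os
import re
from datetime import datetime


def latest_dated_file(pattern):
    """Return the matching path whose name carries the most recent YYYYMMDD stamp."""
    dated = []
    for path in glob.glob(pattern):
        match = re.search(r"_(\d{8})\.csv$", os.path.basename(path))
        if match:
            stamp = datetime.strptime(match.group(1), "%Y%m%d")
            dated.append((stamp, path))
    if not dated:
        raise FileNotFoundError(f"no dated file matches {pattern!r}")
    return max(dated)[1]


# Usage with pandas (not imported here):
# df1 = pd.read_csv(latest_dated_file('files/file-name_*.csv'))
```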
There are several source files in VHDL. All files have a header which gives the file name, creation date, and description, among other things. One of these is the last update date. All files are version controlled in Git.
The files are often modified, committed, and pushed. However, the last update date is often not updated. This happens by mistake: so many different files are worked on at different times that one can forget to change the "last update" part of the file header to the latest date when the file has actually been changed.
I want to automate this process and believe there are many different ways to do this.
A script of some sort must check the last update date in the text file header. Then, if it differs from the actual last modified date available from the file's properties in the file system, the last update date in the text must be set to the last modified date. What would be the best way to do this? A Python script, a Bash script, or something else?
Basically I want to do this when the files are being committed to Git. It should ideally happen automatically, but running one line in the terminal to execute a script is not a big deal. The check is required on the files that are being committed and pushed.
I'm not a Python programmer, but I made a little script to hopefully help you out. Maybe this fits your needs.
What the script should do:
Get all files from the path (here c:\Python) which have the extension .vhdl
Loop over the files and extract the date from line 9 via regex
Get the last modified date from the file
If the last modified date is later than the date in the file, update the file
import os
import re
import glob
import datetime

path = r"c:\Python"
# Search inside `path` rather than the current working directory
mylist = glob.glob(os.path.join(path, "*.vhdl"))
print(mylist)

for filepath in mylist:
    with open(filepath, 'r+') as f:
        content = f.read()
        # Date written in the header, in the form "Last update: YYYY-MM-DD"
        last_update = re.findall(r"Last\supdate:\s+(\d{4}-\d{2}-\d{2})", content)
        if not last_update:
            print(filepath, 'NO DATE FOUND')
            continue
        # Last modified date from the file system, as YYYY-MM-DD
        modified = os.path.getmtime(filepath)
        modified_readable = str(datetime.datetime.fromtimestamp(modified))[:10]
        # Dates in YYYY-MM-DD form compare correctly as strings
        if modified_readable > last_update[0]:
            print(filepath, 'UPDATE')
            # Note: this replaces every occurrence of the old date in the file
            text = re.sub(last_update[0], modified_readable, content)
            f.seek(0)
            f.write(text)
            f.truncate()
        else:
            print(filepath, 'NO CHANGE')
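To run the check only on files that are about to be committed, the same idea could be driven from a Git pre-commit hook. The sketch below is an assumption about how that might look, not part of the original answer: `staged_vhdl_files` and `bump_header_date` are hypothetical helper names, and the header format "Last update: YYYY-MM-DD" is taken from the regex above. It only rewrites the header line itself, not other occurrences of the same date.

```python
import re
import subprocess

HEADER_RE = re.compile(r"(Last\supdate:\s+)(\d{4}-\d{2}-\d{2})")


def staged_vhdl_files():
    """List staged .vhdl paths (added, copied, or modified) via `git diff --cached`."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return [p for p in result.stdout.splitlines() if p.endswith(".vhdl")]


def bump_header_date(content, new_date):
    """Return content with the first 'Last update' date replaced, if new_date is later."""
    match = HEADER_RE.search(content)
    if match is None or new_date <= match.group(2):
        return content
    return content[:match.start(2)] + new_date + content[match.end(2):]
```

A hook script would presumably loop over `staged_vhdl_files()`, write back the result of `bump_header_date`, and stage the file again with `git add` so the updated header is part of the commit.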
New to PDI here. I need to output data from a view in a PostgreSQL database to a file daily. The output file will be named like xxxx_20160427.txt, so I need to append the dynamic date to the file name. How do I do it?
EDIT-----------------
I was not clear when asking how to add a dynamic date: I am trying to add not just the date but also other optional parts to the file name, e.g. a serial number (01) at the end: xxxx_2016042701.txt. So my real question is how to make a dynamic file name. In other ETL tools, e.g. SSIS, it would be a simple expression. I am not sure how it is done in PDI.
In your Text file output step, simply check "Include date in filename?" under the files tab.
You can create a dynamic filename variable with a Modified Java Script Value step,
and then in the Text file output step click on "Accept file name from field" and select the variable declared in the previous step (filename_var in this example).
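Independent of PDI, the naming rule from the question (prefix, date, optional two-digit serial number) can be sketched in a few lines. This is a Python illustration of the logic only, not PDI code; the function name `build_filename` and its parameters are hypothetical.

```python
from datetime import date


def build_filename(prefix, day, serial=None, ext=".txt"):
    """Compose '<prefix>_<YYYYMMDD>[<two-digit serial>]<ext>'."""
    stamp = day.strftime("%Y%m%d")
    suffix = f"{serial:02d}" if serial is not None else ""
    return f"{prefix}_{stamp}{suffix}{ext}"


print(build_filename("xxxx", date(2016, 4, 27)))
print(build_filename("xxxx", date(2016, 4, 27), serial=1))
```

The same string-building expression would go into whichever scripting step produces the filename field.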
This question is related to Swift and Objective-C.
I want to create variables from constant strings, so that in future, when I change the name of a variable used throughout the app, I only need to change it in one place and the change applies wherever it is used.
Example:
I have user_id in 14 files. If I want to change user_id to userID, I have to change it in all 14 files, but I want to change it in one place only.
One way to do this would be to use the Xcode build process and add a script (the language can be of your choice, but the default is a Bash script).
Create a string constant text file where you define all the variables you want to change, in some format that expresses the change you want to make, for example:
"variable_one_name" = "new_variable_one_name"
Depending on how 'smart' you want your script to be, you could also list all your variables and include some way of indicating when a variable is not to be replaced.
"variable_one_name" = "new_variable_one_name"
"variable_two_name" = "DO_NOT_CHANGE"
Run a pre-build script on your project that reads in the string constant text file, then iterates through your source files and executes the desired replacements. Be careful to limit the directories you search to your OWN source files!
build project...
This would allow you to manage your constants from one place. However it clearly is only going to help you after you have created a project and written some code :)
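The steps above can be sketched as follows. This is a minimal Python sketch rather than the Bash script the answer mentions, and it assumes the mapping file uses exactly the `"old" = "new"` format shown, with `DO_NOT_CHANGE` as the skip marker; the function names are hypothetical.

```python
import re

LINE_RE = re.compile(r'^\s*"([^"]+)"\s*=\s*"([^"]+)"\s*$')


def parse_mapping(text):
    """Read '"old" = "new"' lines, skipping entries marked DO_NOT_CHANGE."""
    mapping = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match and match.group(2) != "DO_NOT_CHANGE":
            mapping[match.group(1)] = match.group(2)
    return mapping


def apply_mapping(source, mapping):
    """Replace whole identifiers only, so user_id does not match my_user_id2."""
    for old, new in mapping.items():
        source = re.sub(r"\b" + re.escape(old) + r"\b", new, source)
    return source


rules = parse_mapping('"user_id" = "userID"\n"variable_two_name" = "DO_NOT_CHANGE"')
print(apply_mapping("let user_id = 1; let my_user_id2 = 2", rules))
```

A pre-build script would apply `apply_mapping` to each source file in the project's own directories and write the result back. Note that a plain textual replacement also touches string literals and comments that contain the identifier.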
BASH string replacement
Adding a run script to the Xcode build process
I'm attempting to create a process to import data. I created the entire process and it works, but I'm having trouble creating the variable that automatically finds the file name of the CSV I want to import. Each time a new CSV is uploaded to me, it has a timestamp on it. I want to be able to grab that file no matter what the name is and do work on it.
So for example this week the file name would be
filename_4-14-2014.csv
And next week
filename_4_21_2014.csv
And so on into eternity. . .
Is there a way to create a variable that picks up the full file name even though it's changing?
After doing some poking around, I've discovered the following...
You can use a file system task to perform the copy operation I was referring to. You can set the input file and the output file as variables. This way you can always know that the file you use for import is always named the same, and has the right data.
You just need to add the variables and a File System Task to your package.
OK, so to accomplish what I wanted I created a Foreach Loop Container. I had it look for any files ending with .csv in my specified folder by using a wildcard (*.csv).
The contents of the Foreach Loop Container are as follows.
Step 1: File System Task - rename file.
Step 2: Data Flow Task - Import data to SQL
Step 3: File System Task - Copy the file to another folder, append datetime to filename
Step 4: File System Task - Delete source file.
I used variables to get all the file and folder names plus datetimes.
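For readers outside SSIS, the loop described above can be sketched in Python. This is an illustration of the same pattern under stated assumptions, not the package itself: `process_folder` and the `import_file` callback are hypothetical names, the rename in Step 1 is skipped because the loop already knows each file's real name, and the copy and delete steps are done separately as in the original.

```python
import glob
import os
import shutil
from datetime import datetime


def process_folder(src_dir, archive_dir, import_file, now=None):
    """Import every *.csv in src_dir, archive it with a datetime suffix, delete the source."""
    now = now or datetime.now()
    os.makedirs(archive_dir, exist_ok=True)
    archived = []
    for csv_path in sorted(glob.glob(os.path.join(src_dir, "*.csv"))):
        import_file(csv_path)                      # Step 2: load the data
        stem, ext = os.path.splitext(os.path.basename(csv_path))
        target = os.path.join(archive_dir, f"{stem}_{now:%Y%m%d%H%M%S}{ext}")
        shutil.copy2(csv_path, target)             # Step 3: copy with datetime appended
        os.remove(csv_path)                        # Step 4: delete the source file
        archived.append(target)
    return archived
```

Because the wildcard matches whatever name arrives, the changing timestamp in the file name never has to be known in advance.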