In ArcGIS or QGIS is it possible to add timestamps to geotiffs from a CSV file containing the timestamps?

I have about 400 geotiffs consisting of satellite data obtained on different dates. I would like to make an animated time series from these data using GIS (either in ArcGIS or QGIS), Python or R. The issue I am having with the GIS approach is that the geotiffs do not have a 'date' field, so I cannot do this directly. However, I have a CSV file containing the geotiff file names in one column and the date for each geotiff in another. So my question is: is it possible to populate the geotiffs with the time information contained in the CSV file? Furthermore, each of my geotiff filenames contains the date (in the format YYYY-mm-dd), e.g. 'XYZ_2000-01-01.tif'.
So far I have tried using the Time Manager plugin in QGIS, with which I attempted to extract the date for each geotiff from the filename (see screenshot below). This was unsuccessful (Error = Could not find a suitable time format or date). I managed to create an animated time series for the data in R but I couldn't add a basemap, so the geographical context was lost.

You are specifying the start of your date wrongly, and I think you probably want to add them as rasters, not layers. Your date starts a long way past position 0 - somewhere around 50. So I would rename all your files to put the date at the start of the filename, to save counting all the way to the end.
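If renaming roughly 400 files by hand is not appealing, here is a minimal Python sketch that moves the date to the front of each filename. The folder name is hypothetical, and it assumes the date always appears as YYYY-mm-dd somewhere in the name:

    import os
    import re

    folder = "geotiffs"  # hypothetical folder containing the .tif files

    # Matches a YYYY-mm-dd date anywhere in the filename, e.g. XYZ_2000-01-01.tif
    date_pattern = re.compile(r"\d{4}-\d{2}-\d{2}")

    for name in os.listdir(folder):
        if not name.lower().endswith(".tif"):
            continue
        match = date_pattern.search(name)
        if match and not name.startswith(match.group(0)):
            # Put the date first so the start position in Time Manager is 0.
            os.rename(os.path.join(folder, name),
                      os.path.join(folder, match.group(0) + "_" + name))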
My setup looks like this (in QGIS Pi):

Related

Azure Data Factory - How to create multiple datasets and apply different treatments on files in same blob container?

Just starting out with Azure Data Factory here.
I have a scenario where I gather CSV files (different sources and formats/templates) that I store in a single Azure blob container. I would like to extract the data to an SQL DB. I need to apply different treatments to the files before pushing the data to SQL, based on the format. The format is indicated in each file name (for example: Myfile-formatA-20201201).
I am unclear on my pipeline / dataset setup. I assume I need to create a new (input) dataset for each CSV format, but I cannot find a way to create differentiated datasets by relying on the different naming patterns. If I create a single input dataset instead, I can create a pipeline with differentiated copy activities using that single input dataset and applying different filtering rules (based on my file naming pattern) - which seems to work fine for files having the same encoding, column delimiters etc., but, as expected, fails for the files that do not.
I could not find any official information on how to apply filters when creating multiple datasets from files contained in the same container. Is it possible at all? Or is it a prerequisite to store files with different formats in different containers or directories?
I created a test to copy CSV files of different formats in one pipeline, then selected different copy activities according to the file name. I think this is the answer you want.
In my container, I created CSV files in two formats:
Create a dataset pointing to the input container:
Edit: Do not specify a file in the File Path
Use the Get Metadata1 activity to get the Child items.
The output is as follows:
Then, in the ForEach1 activity, we can traverse this array. Add the dynamic content @activity('Get Metadata1').output.childItems to the Items tab.
Inside the ForEach1 activity, we can use a Switch1 activity and add the dynamic content @split(item().name,'-')[1] to the Expression. It will extract the format name, e.g. Myfile-formatA-20201201 -> formatA.
In the default case, we can copy the CSV files of formatA.
Edit: in order to select only files with "formatA" in their name, use the Wildcard file path option in the copy activity:
Key in @item().name, so we can specify a single CSV file.
Add formatB case:
Then use the same source dataset.
Edit: as in the previous step, use the Wildcard file path option:
That's all. We can set a different sink in each of these Copy activities.
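For readers who find the expression syntax terse, the routing that the Switch1 activity performs can be sketched in plain Python. The copy functions below are hypothetical stand-ins for the copy activities, not ADF code:

    # Hypothetical stand-ins for the formatA and formatB copy activities.
    def copy_format_a(name):
        print(f"copy {name} using the formatA settings")

    def copy_format_b(name):
        print(f"copy {name} using the formatB settings")

    # Equivalent of the Get Metadata child items array.
    child_items = ["Myfile-formatA-20201201.csv", "Myfile-formatB-20201201.csv"]

    for name in child_items:
        # Same idea as @split(item().name,'-')[1] in the Switch expression.
        fmt = name.split("-")[1]
        if fmt == "formatB":
            copy_format_b(name)
        else:
            # The default case handles formatA.
            copy_format_a(name)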

Excel to CSV Plugin for Kettle

I am trying to develop a reusable component in Pentaho which will take an Excel file and convert it to a CSV with an encoding option.
In short, I need to develop a transformation that has an Excel input and a CSV output.
I don't know the columns in advance. The columns have to be dynamically injected into the Excel input.
That's a perfect candidate for Pentaho Metadata Injection.
You should have a template transformation which contains the basic workflow (read from the Excel file, write to the text file), but without specifying the input and/or output formats. Then you should store your metadata (the list of columns and their properties) somewhere. In Pentaho's example an Excel spreadsheet is used, but you're not limited to that. I've used a couple of database tables to store the metadata, for example: one for the input format and another one for the output format.
Also, you need a transformation that has the Metadata Injection step to "inject" the metadata into the template transformation. What it basically does is create a new transformation at runtime, using the template and the fields you set to be populated, and then run it.
Pentaho's example is pretty clear if you follow it step by step, and from that you can then create a more elaborate solution.
You'll need at least two steps in a transformation:
Input step: Microsoft Excel input
Output step: Text file output
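For comparison only, the same Excel-to-CSV conversion with an explicit encoding option can be sketched outside PDI in a few lines of Python with pandas. The file names are hypothetical; pandas discovers whatever columns are present, so nothing needs to be declared up front:

    import pandas as pd

    excel_path = "input.xlsx"   # hypothetical input workbook
    csv_path = "output.csv"     # hypothetical output file

    # The columns do not need to be known in advance;
    # pandas reads whatever the sheet contains.
    df = pd.read_excel(excel_path)

    # Write the CSV with an explicit encoding option.
    df.to_csv(csv_path, index=False, encoding="utf-8")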
So, here is the solution. In your Excel Input step, in the Fields section, list the maximum number of fields that can appear in any Excel file. Then route the Excel input to the text file output based on the number of fields actually present. You need to use a Switch/Case step here.

How to extract table data from pdf and store it in csv/excel using Automation Anywhere?

I want to extract the table data from pdf to excel/csv. How can I do this using Automation Anywhere?
Please find below the sample table from pdf document.
There are multiple ways to extract data from PDFs.
You can extract raw data, formatted data, or create form fields if the layout is consistent.
If the layout is more random, you might want to take a look at IQ Bot, where there are predefined classifications for things like Orders etc.
I would lean towards using form fields if you have a standard layout but unusual characters, such as the " symbol for inches, since that encoding doesn't map well with the raw/formatted options.
The raw format has some quirks where you don't always get all the characters you expect, such as the first letter of a data item going missing.
The formatted option is good at capturing tabular columns as they go across the line.
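As a side note, if a scripted fallback outside Automation Anywhere is ever an option and the table layout is consistent, a library such as pdfplumber can dump the table to CSV. A minimal Python sketch with hypothetical file names:

    import csv
    import pdfplumber

    pdf_path = "report.pdf"   # hypothetical input
    csv_path = "report.csv"   # hypothetical output

    with pdfplumber.open(pdf_path) as pdf, open(csv_path, "w", newline="") as out:
        writer = csv.writer(out)
        for page in pdf.pages:
            # Rows of the table detected on the page (None if no table found).
            table = page.extract_table()
            if table:
                writer.writerows(table)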

Can you set Fixed File Input column definitions dynamically in Pentaho data-integration (PDI)?

I have a metadata file which contains the column name, starting position, and length. I would like to read these values and define my columns within a FIXED FILE INPUT step.
Is there a way to do this in PDI? My file contains over 200 columns at fixed widths, and manually entering the information would be very time consuming, especially if this definition changes over time.
Use the Metadata Injection step to inject the metadata into the prescribed steps; refer to Matt Casters on figuring out delimited files and the Metadata Injection description.
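For illustration only (this is not the PDI route the answer describes), the same layout-driven parse can be sketched in Python with pandas. It assumes the metadata file has name, start and length columns with 1-based start positions; the file names are hypothetical:

    import pandas as pd

    metadata_path = "layout.csv"     # hypothetical: columns name, start, length
    data_path = "fixed_width.txt"    # hypothetical fixed-width data file

    meta = pd.read_csv(metadata_path)

    # Convert 1-based start positions and lengths into (start, end) spans.
    colspecs = [(s - 1, s - 1 + l) for s, l in zip(meta["start"], meta["length"])]
    names = meta["name"].tolist()

    df = pd.read_fwf(data_path, colspecs=colspecs, names=names, header=None)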

Import Unformatted txt file into SQL

I am having an issue importing data into SQL from a text file. Not because I don't know how...but because the formatting is pretty much terrible for this purpose. Below is an altered sample of the types of text files I need to work with:
1 VA - P
2 VB to 1X P
3 VC to 1Y P
4 N - P
5 G to 1G,Frame P
6 Fout to 1G,Frame P
7 Open Breaker P
8 1B to 1X P
9 1C to 1Y P
Test Status: Pass
Hi-Pot # 1500V: Pass
Customer Order:904177-F
Number: G4901626-200
Serial Number: J245F6-2D03856
Catalog #: CBDC37-X5LE30-H40-L630C-4GJ-G31
Operator: TGY
Date: Aug 01, 2013
Start Time: 04:09:26
Finish Time: 04:09:33
The first 9 lines are all specific test results (tab separated), with header information below. My issue is that I need to figure out:
How can I take the data above and turn it into something broken down into a standard column format to import into SQL?
How can I then automate this such that I can loop through an entire folder structure?
What you see above is one of hundreds of files divided into several sub-directories.
Also note that the number of test lines above the header information varies from file to file. The header information remains in much the same format, though. This is all legacy data that cannot be regenerated, but it needs to be imported into our SQL databases.
I am thinking of using an SSIS project with a custom script to import the data...splicing the top section from the bottom by looking for the first empty row...then pivot the data in the header into column format...merge...then move on. But I don't write much VB and I'm not sure how to approach that.
I am working in a SQL Server 2008R2 environment with access to BIDS.
Thoughts?
I would start by importing the data as all character into a table with a single field, one record per line. Then, from that table, you can parse each record into the fields and field types appropriate for each line. Hopefully there is a way to figure out what kind of data each line is, whether each file is consistent in order, or whether the header record indicates information about subsequent lines. From that, the data can be moved to a final table (parsing may take more than one pass) with the data stored in a format that is usable for whatever you need.
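If a pre-processing pass outside the database is acceptable, the per-line classification described above can be sketched in Python. It assumes tab-separated test lines, 'Key: Value' header lines, and a blank line between the sections; all file names below are hypothetical:

    import csv
    import os

    def parse_report(path):
        """Split one legacy report into test rows and header key/value pairs."""
        tests, header = [], {}
        with open(path) as f:
            for line in f:
                line = line.rstrip("\n")
                if not line.strip():
                    continue  # skip the blank separator line
                if "\t" in line:
                    # Test section: tab-separated result lines.
                    tests.append(line.split("\t"))
                elif ":" in line:
                    # Header section: "Key: Value" lines.
                    key, _, value = line.partition(":")
                    header[key.strip()] = value.strip()
        return tests, header

    def export(root, tests_csv="tests.csv", headers_csv="headers.csv"):
        """Walk a folder tree and flatten every report into two CSVs."""
        with open(tests_csv, "w", newline="") as t_out, \
             open(headers_csv, "w", newline="") as h_out:
            t_writer, h_writer = csv.writer(t_out), csv.writer(h_out)
            for dirpath, _, files in os.walk(root):
                for name in files:
                    if not name.lower().endswith(".txt"):
                        continue
                    tests, header = parse_report(os.path.join(dirpath, name))
                    for row in tests:
                        t_writer.writerow([name] + row)
                    for key, value in header.items():
                        h_writer.writerow([name, key, value])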
I would first concentrate on getting the data into the database in the least complicated (and least error prone) way possible. Create a table with three columns: filename, line_number and line_data. Plop all of your files into that table and then you can start to think about how to interpret the data. I would probably be looking to use PIVOT, but if different files can have different numbers of fields it may introduce complications.
I would use a different approach and use an SSDT/SSIS package to import the data.
Add a script component to read in the text file and convert it to XML. It's not hard; there are many examples on the web. In your script, store the XML you develop in a variable.
Add a data flow
Add an XML Source. In the XML source you can select the XML variable you created and process either group of data present in your file. Here is some information on using the XML Source.
Add a destination task to import it into a destination of your choice.
This solution assumes your input lines are terminated {CR}{LF}, the normal Windows way.
Tell MSSQL's Import/Export Wizard to import a Flat File; the Format is "Delimited"; the "Text Qualifier" is the {CR}; the "Header Row Delimiter" is the {LF}; and the OutputColumnWidth (in "Advanced") is a bit more than the longest possible line length.
It's simple and it works.
I just used this to import 23 million lines of mixed up data, and it took less than ten minutes. Now to edit it...