I have an SSIS package that processes a CSV file. It pulls a single source file, \\server\dash\LABORDERS.CSV, and is working fine.
We wanted to keep the older files for historic purposes, so every day there will be a new file instead of just overwriting the old one, and the folder looks like this:
I know I am supposed to add a script task, but I am not sure where to add it or how to invoke it so that the source is always picking up the latest file in the folder and using that file to transfer data to its SQL destination.
How can I achieve it?
What have you tried? You could create a Script Task at the start of your control flow that uses the .NET Framework file system classes to search a directory and get the file with the most recent timestamp. You could then assign that file name to an SSIS variable and use that variable in your file connection manager.
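For illustration, a minimal sketch of such a Script Task in C# (so SSIS 2008 or later; the SSIS 2005 Script Task is VB.NET only). It assumes a package variable named User::LatestFile and the naming pattern LABORDERS*.CSV, both of which you would adjust to your setup:

```csharp
// Main() of the ScriptMain class that SSIS generates for the Script Task;
// the using directives go at the top of the generated file.
// Assumes User::LatestFile is listed in the task's ReadWriteVariables.
using System;
using System.IO;
using System.Linq;

public void Main()
{
    string folder = @"\\server\dash";

    // Pick the CSV with the most recent write time.
    FileInfo latest = new DirectoryInfo(folder)
        .GetFiles("LABORDERS*.CSV")
        .OrderByDescending(f => f.LastWriteTime)
        .FirstOrDefault();

    if (latest != null)
    {
        // Hand the full path to the package so the connection manager can use it.
        Dts.Variables["User::LatestFile"].Value = latest.FullName;
        Dts.TaskResult = (int)ScriptResults.Success;
    }
    else
    {
        Dts.TaskResult = (int)ScriptResults.Failure;
    }
}
```

Then put an expression on the flat file connection manager's ConnectionString property that points at @[User::LatestFile], and set DelayValidation to True so the package validates against the value assigned at run time.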
Is there an easy way to move a file to a different folder in dbt Cloud, without having to create a new file of the same name in the new folder, copy/paste the contents from the old file, and delete the old file? That workflow is a pain.
Is there a good reason I should NOT do this? I assume my refs still work as long as the filename remains the same, and I don't have any specific folder logic tied to this file.
For example, say I have my_model.sql in my 'staging' folder and I want to simply move it to my 'mart' folder instead. In this example I'd like to do this to reflect that my file is really a more 'stable' mart-type table file vs a staging view. I realize I can just change the materialization type, but I'm doing this more to organize things clearly. Thanks!
The way to move a file in the cloud IDE for dbt is not 100% obvious. You can use the rename function to move a file to another location.
Click the drop-down next to the file name, then select "Rename." That opens the file path, and you can change where the file lives by typing in the new folder's name.
The easiest way I have found to do this is...not using dbt Cloud, but using GitHub Desktop (no command line needed).
Create a new branch
Open the repository in GitHub Desktop
View files in your file explorer
Move files or directory locally
Upload to GitHub
Push to origin for the branch you created
Open a pull request
Merge
Depending on what the file or directory is, you may find creating a new branch and opening a PR to be overkill. For my specific project there is a lot of legacy organization, and there are models we aren't totally sure don't have downstream dependencies, so creating a new branch for this allowed me to test-run all of our models.
Hope this helps!
Situation now:
I have a data warehouse job profile that publishes a .txt file to a Data folder every morning. I open a Tableau workbook, which automatically updates the data visualisations because of the union I made. I save this workbook as an extract, and colleagues without Tableau Desktop can view it via Tableau Reader.
What I need:
This reporting format is heavily dependent on me and I need to automate this.
Is this even possible without Tableau Server?
Since Tableau Reader can only use packaged workbooks with extracted data, you cannot achieve this directly.
However, you can automate the packaging process using Tableau's command-line parameters, so the process will no longer depend on any one person.
Check the .PDF file at the link below. Using that help document, you can create a .BAT file and have Task Scheduler on your computer run it periodically. Users can then open the packaged file from the network location where you save it. Alternatively (if every user's computer has Tableau Desktop installed), you can add a line at the end of the .BAT file that opens the file, so users can run the .BAT whenever they want to see the report.
https://community.tableau.com/docs/DOC-5209
Bernardo was correct in saying the Extract API can be used to programmatically create extracts, and thus "refresh" an extract by simply recreating it (the point about Tableau Server is only relevant if you want to publish the extract that you create with the Extract API).
Where you might have trouble is that there is currently no supported way to programmatically replace an extract within a .twbx file. That said, it should be possible to do this by simply renaming the .twbx to .zip (it is, after all, just an archive) and then using something like Python's zipfile module to manipulate the archive and replace the extract with your new one.
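If you would rather stay in .NET than use Python, the same idea takes a few lines with System.IO.Compression. A rough sketch, where the workbook path, the new extract, and the entry path inside the archive are all assumptions you would check against your own .twbx (rename a copy to .zip and look inside):

```csharp
// Sketch: swap the extract inside a .twbx by treating it as a zip archive.
using System.IO.Compression;

class ReplaceExtract
{
    static void Main()
    {
        string workbook = @"C:\reports\Dashboard.twbx";      // hypothetical path
        string newExtract = @"C:\reports\refreshed.hyper";   // freshly created extract
        string entryPath = "Data/Extracts/refreshed.hyper";  // path inside the archive

        using (ZipArchive archive = ZipFile.Open(workbook, ZipArchiveMode.Update))
        {
            // Remove the stale extract if it is present, then add the new one.
            ZipArchiveEntry old = archive.GetEntry(entryPath);
            if (old != null)
            {
                old.Delete();
            }
            archive.CreateEntryFromFile(newExtract, entryPath);
        }
    }
}
```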
NB: The Extract API can only be used to create .hyper files. If you want to work with .tde files, you'll need to use the Tableau SDK instead.
I am using Azure Data Lake Store for file storage. I am using operations like:
Creating a main file
Creating part files
Appending these part files to the main file (using concurrent append)
Example:
There is a main log file (which will eventually contain the logs from all programs)
There are part log files that each program creates on its own and then appends to the main log file
The workflow runs fine, but I have noticed some unknown files appearing in the store directory. These files are named with a GUID, have no extension, and are empty.
Does anyone know what might be the reason for these extra files?
Thanks for reformatting your question. This looks like a processing artefact that will probably disappear shortly. How did you upload/create your files?
Is it possible to get the actual file, or the file that gets copied from version control to a location?
This sounds confusing. Basically, I have the path of the version-controlled file, but I need an actual path on disk because I need to make a console call using powershell.exe. The file path looks something like this:
$/MyTeamProject/MyProject/Development/MyPowershellScript.ps1
Now, I am looking for a VB expression to see if I can get the actual file and call powershell.exe from the console. Any thoughts?
You may use VersionControlServer.GetItem(String path) to obtain a reference to the Item. Then use Item.DownloadFile() or Item.DownloadFile(String localPath) to copy the file locally. I have a variation of this that creates a shipment based on multiple changesets.
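A hedged sketch of that approach in C#; the collection URL and local path are placeholders, and the server path is the one from the question:

```csharp
// Download a version-controlled script locally, then run it with powershell.exe.
using System;
using System.Diagnostics;
using Microsoft.TeamFoundation.Client;
using Microsoft.TeamFoundation.VersionControl.Client;

class RunVersionedScript
{
    static void Main()
    {
        var collection = new TfsTeamProjectCollection(
            new Uri("http://tfsserver:8080/tfs/DefaultCollection"));   // hypothetical URL
        var vcs = collection.GetService<VersionControlServer>();

        // Resolve the server item and pull it down to a real path on disk.
        Item item = vcs.GetItem("$/MyTeamProject/MyProject/Development/MyPowershellScript.ps1");
        string localPath = @"C:\Temp\MyPowershellScript.ps1";          // hypothetical local path
        item.DownloadFile(localPath);

        // Now the local copy can be handed to powershell.exe.
        Process.Start("powershell.exe",
            "-ExecutionPolicy Bypass -File \"" + localPath + "\"");
    }
}
```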
I have a data flow task that imports Excel files. I can't use a Foreach Loop to go through the Excel files because the metadata for each Excel file is completely different.
So in the data flow task I have 10 separate source files and use a Union All component to combine them, then import the result into SQL.
The problem I am facing now is that some of the Excel files I am importing might not exist, so when my package runs it fails because a file is missing. Is there any way for me to create a check that lets the package skip the source files that don't exist and process the rest?
I am using SSIS 2005.
Suggestion: if the file doesn't exist, then create it first.
Have an empty version of each source file somewhere, and in your control flow (before the data flow), check to see if the files exist, and if they don't, copy the blank files to the location of the real files.
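A minimal sketch of that check; note that SSIS 2005 Script Tasks only support VB.NET, so this C# would need translating, and every path and file name below is an assumption:

```csharp
// Sketch of the "copy a blank file if the real one is missing" idea.
using System.IO;

class EnsureSourceFilesExist
{
    static void Main()
    {
        string sourceFolder = @"\\server\imports";       // where the data flow reads from
        string templateFolder = @"\\server\templates";   // blank copies with the right metadata

        // Hypothetical file names; for each expected workbook, drop in the
        // empty template when the real file is absent.
        string[] expectedFiles = { "Sales.xls", "Orders.xls", "Returns.xls" };

        foreach (string name in expectedFiles)
        {
            string target = Path.Combine(sourceFolder, name);
            if (!File.Exists(target))
            {
                File.Copy(Path.Combine(templateFolder, name), target);
            }
        }
    }
}
```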
This article explains how to perform a "does file exist" check in SSIS:
http://www.bidn.com/blogs/DevinKnight/ssis/76/does-file-exist-check-in-ssis