Store files ordered by timestamp in Azure File Share - azure-storage

I have an application that writes files to Azure file share. But in Azure file share I see files are storing/inserting in alphabetical order. But I want them store/insert as per their insert timestamp. Is there any configuration or setting to keep the order of insertion in azure file share ?

Is there any configuration or setting to keep the order of insertion
in azure file share ?
No, as such there is no setting to change the order in which the files are stored/displayed. They are always sorted alphabetically. Furthermore, there is no server-side sorting available.
If you want to sort the files by insertion date/time, you will need to name the files in such a way so that the insertion time is included in the file name e.g. 2022-01-29T10:00:00Z-file1.txt, 2022-01-29T10:01:00Z-file1.txt.

Related

Mosaic Decisions Azure BLOB writer node creating multiple files

I’m using mosaic decisions data flow feature to read a file from Azure blob, do a few transformations and write that data back to Azure. It worked fine except that in the output file path I have given, it created a folder and I can see many files with some strange “part-000” etc in their names. What I need is a single file in that output location – Not many. Is there a way around this?
Mosaic-Decisions uses apache spark as its backend execution engine. In Spark, the dataframe read is split into multiple partitions and these partitions are written to the output location in parallel. That's the reason it creates multiple files at the target location with "part-0000", "part-0001" etc. (part here represents partition).
The workaround on this is to check "combine-output-files-into-one" in writer node. This will combine all of the part files into one big file. But use this with caution and only if you really need a single file - as this will come with a performance tradeoff.

Database model to manage documents

I need to build a tables related to manage documents such as jpg,doc,msg,pdf using a sql server 2008 .
As i know sql server support .jpg images, so my question is if it's possible to upload other kind of files into a db.
This is an example of the table (could be redefined if it's needed).
Document : document_id int(10)
name varchar(10)
type image (doesnt know how it might works)
Those are the initial values for a table, but i dont know how to make it useful for any type.
pd: do i need to assign a directory to save this documents into the server?
You can store almost any file type in an sql server table...if you do, you will almost certainly regret it.
Store a meta-data / a pointer to the file in your database instead, and store the files themselves on a disk directly where they belong.
Your database size - and thus hardware required to run it - will grow very rapidly, so you will be incurring large costs that you do not need to incur.
Use Filestream
https://learn.microsoft.com/en-us/sql/relational-databases/blob/filestream-sql-server
I know that a link-only answer is not an answer but I can't believe no one has mentioned it yet
The proper database design pattern is not to save Files into DBMS. You should develop a kind of File Manager Subsystem to manage your files for all of your projects.
File Manager Subsystem
This subsystem should be Reusable, Extendable, Secure and etc. All your projects that want to save Files, can use this subsystem.
Files can be saved in every where such as Local Hard, Network Drive, External Drives, Clouds and etc. So this subsystem should be design to support all kind of requests.
(you can improve the mentioned subsystem by adding a lot of features to it. for example checking duplicate files,...)
This subsystem, should generate a Unique Key for each file. After uploading and saving the files, the subsystem should generate that key.
Now, you can use this Unique Key to save in database (instead of file). Every time if you want to get the file, you can get the Unique Key from database and request to get file from the subsystem by unique key.

NSFileManager contentsEqualAtPath:andPath: compare checksum data

Does the NSFileManager method contentsEqualAtPath:andPath: create a dynamic checksum to compare two files, does it open the file header and compare file header details or does it use some other method for comparing?
I have a list of 200,000 or so files to compare where the local files are to be compared with the files on a remote server volume. The local files would have been copied from the remote server volume at some point in the past, and I will be walking the list of files to compare each and then copying over any newer files from the remote server volume to the local machine (overwriting any existing). There is no guarantee that the remote server files were created by the local user (and more than likely they would not have been).
As the files are small (approx. 4K in size) a complex file compare operation might take almost as long as a copy operation.
This operation could (conceivably but not likely) happen multiple times in a user session so I need to make sure that I am using the most efficient method for checking.
The operation itself will run on a separate thread so I don't have issues of tying up the user while the operation completes.
I've started the implementation to test this but was interested to see if anyone else has had any experience comparing thousands of files quickly in order to determine which files need updating if a newer one exists. And if you have, do you have any pointers or pitfalls to avoid?
Any advice much appreciated.
Update
Thinking about this some more, it might be more beneficial to keep a file that tracks the last updated timestamp of any changed images and keep a local file that does the same and just compare those two documents... Will update more as I progress.
It looks to me that for directories only filenames (and filenames of subdirectories) are compared. It only compares file content if you explicitly pass file paths to the method.

Storing uploaded content on a website

For the past 5 years, my typical solution for storing uploaded files (images, videos, documents, etc) was to throw everything into an "upload" folder and give it a unique name.
I'm looking to refine my methods for storing uploaded content and I'm just wondering what other methods are used / preferred.
I've considered storing each item in their own folder (folder name is the Id in the db) so I can preserve the uploaded file name. I've also considered uploading all media to a locked folder, then using a file handler, which you pass the Id of the file you want to download in the querystring, it would then read the file and send the bytes to the user. This is handy for checking access, and restricting bandwidth for users.
I think the file handler method is a good way to handle files, as long as you know to how make good use of resources on your platform of choice. It is possible to do stupid things like read a 1GB file into memory if you don't know what you are doing.
In terms of storing the files on disk it is a question of how many, what are the access patterns, and what OS/platform you are using. For some people it can even be advantageous to store files in a database.
Creating a separate directory per upload seems like overkill unless you are doing some type of versioning. My personal preference is to rename files that are uploaded and store the original name. When a user downloads I attach the original name again.
Consider a virtual file system such as SolFS. Here's how it can solve your task:
If you have returning visitors, you can have a separate container for each visitors (and name it by visitor login, for example). One of the benefits of this approach is that you can encrypt the container using visitor's password.
If you have many probably one-time visitors, you can have one or several containers with files grouped by date of upload.
Virtual file system lets you keep original filenames either as actual filesnames, or as a metadata for the files being stored.
Next, you can compress the data being stored in the container.

Vb.Net Document Storage

I am attempting to add a document storage module to our AR software.
I will be prompting the user to attach a doc/image to thier account. I will then put a copy of this file into our folder so that we can reference it without having to rely on them keeping the file in its original place. This system is not using a database but instead its using multiple flat files.
I am looking for guidance on how to handle these files once they have attached them to our system.
How should I store these attached files?
I was thinking I could copy the file over to a sub directory then renaming it to a auto-generated number so that we do not have duplicates. The bad thing about this, is the contents of the folder can get rather large.
Anyone have a better way? Should I create directories and store them...?
This system is not using a database but instead its using multiple flat files.
This sounds like a multi-user system. How are you handing concurrent access issues? Your answer to that will greatly influence anything we tell you here.
Since you aren't doing anything special with your other files to handle concurrent access, what I would do is add a new folder under your main data folder specifically for document storage, and write your user files there. Additionally, you need to worry about name collisions. To handle that, I'd name each file there with by appending the date and username to the original file name and taking the md5 or sha1 hash of that string. Then add a file to your other data files to map the hash values to original file names for users.
Given your constraints (and assuming a limited number of total users) I'd also be inclined to go with a "documents" folder -- plus a subfolder for each user. Each file name should include the date to prevent collisions. Over time, you'll have to deal with getting rid of old or outdated files either administratively or with a UI for users. Consider setting a maximum number of files or maximum byte count for each user. You'll also want to handle the files of departed users.