Tag Archive for file-systems

Watching file changes/additions/removal, but with an eye on partial transfer

I would like to monitor the filesystem in Python so that my application is notified when a file is added, removed, or changed. Once a file is detected, the application starts extracting its data through various plugins. The problem is that I am dealing with big files: when the user starts copying a file from outside into the watched directory, it is detected immediately, but it appears corrupted because the copy has not finished. Checking the file size between invocations is potentially a good strategy, but it ignores the fact that other producers of the file (such as wget) might pause for a long time, during which the file size does not change even though the file is incomplete. I don't control the format of the files I am downloading either, so I can't check for an end-of-file marker, because there might not be one.
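To make the size-checking strategy concrete, here is a minimal sketch of it using only the standard library. The function name `wait_until_size_stable` and its parameters are hypothetical, chosen for illustration. It carries exactly the weakness described above: a transfer that stalls for longer than `quiet_seconds` would be reported as finished, so this is a heuristic rather than a guarantee.

```python
import os
import time


def wait_until_size_stable(path, quiet_seconds=5.0, poll_interval=1.0, timeout=60.0):
    """Return True once the size of `path` has not changed for `quiet_seconds`.

    Returns False if `timeout` elapses first or the file cannot be stat'ed.
    Heuristic only: a writer that pauses longer than `quiet_seconds`
    (hypothetical parameter) will be mistaken for a finished transfer.
    """
    deadline = time.monotonic() + timeout
    last_size = -1
    last_change = time.monotonic()
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            # The file vanished or is not accessible.
            return False
        now = time.monotonic()
        if size != last_size:
            # Size moved: restart the quiet period.
            last_size = size
            last_change = now
        elif now - last_change >= quiet_seconds:
            return True
        time.sleep(poll_interval)
    return False
```

A caller would typically run this after the watcher reports a new file and start the plugins only when it returns True. Choosing `quiet_seconds` is the trade-off: larger values tolerate longer pauses but delay processing.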

Finding duplicate files? [duplicate]

This question already has answers here: Which hashing algorithm is best for uniqueness and speed? (11 answers) Closed 11 years ago. I am going to be developing a program that detects duplicate files and I was wondering what the best/fastest method would be to do this? I am more interested in what the best hash […]
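As a rough illustration of one common approach to the problem in this question, the sketch below groups files by size first and hashes only the files that share a size. The function names are hypothetical, and SHA-256 from `hashlib` is an assumption made for the example, not a claim about which hash is best or fastest.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are never loaded whole."""
    h = hashlib.sha256()  # assumed choice of hash, for illustration only
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return a list of groups (sorted lists of paths) with identical content."""
    # Step 1: bucket by size; files of different sizes cannot be duplicates.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Step 2: hash only the files that share a size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

The size pre-filter means most files are never read at all, which usually matters more for speed than the choice of hash function. If hash collisions are a concern, a final byte-by-byte comparison of each group would settle it.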

Why is it not standard to implement abstraction layers for the file system?

I have been taught to access databases through abstraction layers. Why is it not also standard practice to access the file system through an abstraction layer? It seems to me that unit testing would become much simpler, since mocking the filesystem is often a pain and is poorly supported by some languages.
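To show what such an abstraction layer might look like, here is a minimal sketch. All names in it (`FileStore`, `DiskFileStore`, `MemoryFileStore`, `count_lines`) are hypothetical and exist only for this example. Application code depends on a small interface, and tests swap in an in-memory implementation instead of mocking the real filesystem.

```python
from pathlib import Path
from typing import Dict, Protocol


class FileStore(Protocol):
    """The narrow interface the application code depends on."""

    def read_text(self, name: str) -> str: ...
    def write_text(self, name: str, data: str) -> None: ...
    def exists(self, name: str) -> bool: ...


class DiskFileStore:
    """Real implementation backed by a directory on disk."""

    def __init__(self, root: str) -> None:
        self._root = Path(root)

    def read_text(self, name: str) -> str:
        return (self._root / name).read_text(encoding="utf-8")

    def write_text(self, name: str, data: str) -> None:
        (self._root / name).write_text(data, encoding="utf-8")

    def exists(self, name: str) -> bool:
        return (self._root / name).exists()


class MemoryFileStore:
    """Test double that keeps file contents in a dictionary."""

    def __init__(self) -> None:
        self._files: Dict[str, str] = {}

    def read_text(self, name: str) -> str:
        try:
            return self._files[name]
        except KeyError:
            raise FileNotFoundError(name) from None

    def write_text(self, name: str, data: str) -> None:
        self._files[name] = data

    def exists(self, name: str) -> bool:
        return name in self._files


def count_lines(store: FileStore, name: str) -> int:
    """Example application logic that never touches the disk directly."""
    return len(store.read_text(name).splitlines())
```

In a unit test, `count_lines(MemoryFileStore(), ...)` runs without any temporary directories or patching. The cost is that the interface covers only the operations someone thought to include, which may be one reason such layers are less standardized than database ones.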