Small scale document management system architecture / patterns


I usually work on line-of-business desktop software, mostly backed by a single database.
Quite often one of the requirements is to keep track of some files, or the only way to implement a certain feature is to save actual files somewhere on the system and track them via the database.

Is there any general wisdom on how to build this? It’s pretty hard to google, since all that comes up is commercial document management software.

In my experience these never stay small and rarely turn out well. I’m currently rebuilding the one in my application, but I’d like as much info as possible.

How should I think about file system access? I’m using C#, and technically I can access the file system from every layer and every class. At the moment I try to treat it like database access and keep it in one repository.

What about the chicken-and-egg problem of deleting the DB entry first vs. deleting the physical file first? If something goes wrong with the second action, I end up with either a stranded file or a stranded DB entry. I see no way around this, though.

Goals I’m trying to achieve at the moment to keep some sanity:

  • Keep the whole folder/filename definition in one place (letters always land in the ..Letters folder, filenames start with Letter_, etc.).
  • Keep all joining, building, and removing of directory names and filenames in one place (don’t have Path.Combine all over the place).
  • In my application I’m working with sub paths, since the user can change the general path. I’m trying to keep all the sub-path-to-full-path logic in one class and work only with sub paths outside of it.
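
The three goals above can be sketched as a single class that owns both the naming rules and the sub-path-to-full-path mapping. This is only a minimal sketch, not a definitive design: `DocumentPathLayout`, `DocumentKind`, the `Invoice` kind, and the check that a sub path stays under the root are all assumptions added for illustration. Only the Letters folder and the Letter_ prefix come from the list above.

```csharp
using System;
using System.IO;

// Hypothetical document kinds; only "Letter" is taken from the question.
public enum DocumentKind { Letter, Invoice }

// The one place that knows folder names, filename prefixes, and how a
// stored sub path maps to a full path under the user-configurable root.
public sealed class DocumentPathLayout
{
    private readonly Func<string> _getRoot;

    // The root is read through a delegate, so a changed setting applies
    // to the next call without rebuilding this object.
    public DocumentPathLayout(Func<string> getRoot) => _getRoot = getRoot;

    // Builds the sub path that would be stored in the database.
    public string BuildSubPath(DocumentKind kind, string id, string extension)
    {
        var (folder, prefix) = kind switch
        {
            DocumentKind.Letter => ("Letters", "Letter_"),
            DocumentKind.Invoice => ("Invoices", "Invoice_"), // assumed example
            _ => throw new ArgumentOutOfRangeException(nameof(kind))
        };
        return Path.Combine(folder, prefix + id + extension);
    }

    // Resolves a stored sub path to a full path and rejects anything
    // that would end up outside the root (for example "../secret.txt").
    public string ToFullPath(string subPath)
    {
        string root = Path.GetFullPath(_getRoot());
        string separator = Path.DirectorySeparatorChar.ToString();
        if (!root.EndsWith(separator, StringComparison.Ordinal))
            root += separator;

        string full = Path.GetFullPath(Path.Combine(root, subPath));
        if (!full.StartsWith(root, StringComparison.Ordinal))
            throw new ArgumentException("Sub path escapes the document root.", nameof(subPath));
        return full;
    }
}
```

With something like this, the rest of the application would pass sub paths around and call `ToFullPath` only at the point where a file is actually opened, written, or deleted.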

Just looking for links or experiences, I guess.


This ends up looking very much like a traditional file system. You have metadata about your files (name, size, where they live on the store, status such as live/writing/deleting/dead, directory info), and then you have a store that holds the actual bits of the files’ contents. In your case the metadata is in the DB and the bits are in a file, but the concepts are the same.

You then end up with a two-phase-commit sort of approach. You change the metadata status (for example, to deleting), then delete the bits, then commit the change (for example, by marking the entry dead). If the delete fails, the entry is still flagged, so you can detect that and clean up. If the commit fails, you retry the delete and find out it’s already done.
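
The delete sequence described above could look roughly like the following. It is a hedged sketch under stated assumptions rather than a complete implementation: `FileStatus`, `FileRecord`, `IFileMetadataStore`, `InMemoryMetadataStore`, and `DocumentDeleter` are hypothetical names, and the in-memory store only stands in for the real database table so the example runs on its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public enum FileStatus { Live, Deleting, Dead }

public sealed class FileRecord
{
    public int Id { get; set; }
    public string FullPath { get; set; } = "";
    public FileStatus Status { get; set; } = FileStatus.Live;
}

// Hypothetical abstraction over the metadata table in the database.
public interface IFileMetadataStore
{
    FileRecord Get(int id);
    void SetStatus(int id, FileStatus status);
    IReadOnlyList<FileRecord> FindByStatus(FileStatus status);
}

// In-memory stand-in so the sketch is runnable without a database.
public sealed class InMemoryMetadataStore : IFileMetadataStore
{
    private readonly Dictionary<int, FileRecord> _rows = new Dictionary<int, FileRecord>();

    public void Add(FileRecord record) => _rows[record.Id] = record;
    public FileRecord Get(int id) => _rows[id];
    public void SetStatus(int id, FileStatus status) => _rows[id].Status = status;

    // Returns a snapshot, so callers may change statuses while iterating.
    public IReadOnlyList<FileRecord> FindByStatus(FileStatus status) =>
        _rows.Values.Where(r => r.Status == status).ToList();
}

public sealed class DocumentDeleter
{
    private readonly IFileMetadataStore _meta;

    public DocumentDeleter(IFileMetadataStore meta) => _meta = meta;

    public void Delete(int id)
    {
        // Step 1: record the intent before touching the file.
        _meta.SetStatus(id, FileStatus.Deleting);
        Finish(_meta.Get(id));
    }

    // Retries every delete that was started but never committed,
    // for example at application startup.
    public void Recover()
    {
        foreach (FileRecord record in _meta.FindByStatus(FileStatus.Deleting))
            Finish(record);
    }

    private void Finish(FileRecord record)
    {
        // Step 2: delete the bits. File.Delete does not throw when the
        // file is already gone, which makes the retry safe.
        File.Delete(record.FullPath);

        // Step 3: commit. If this fails, the row stays in Deleting and
        // Recover() picks it up later.
        _meta.SetStatus(record.Id, FileStatus.Dead);
    }
}
```

If the process dies between steps 2 and 3, the row is left in the deleting state with no file behind it. Calling `Recover()` repeats the delete, which is a no-op, and then completes the commit. Whether dead rows are kept for auditing or removed outright is a separate choice that this sketch does not make.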

You’ll also need to worry about concurrency, depending on your situation. Handling that well gets involved, and the right approach varies with the tradeoffs you want to make.
