Hi all
Just wanted to get your thoughts on whether it's better to store documents (Word/PDF) in the database, or on the filesystem with a link to each file stored in the DB?
In my app I'll need to store 2 versions of the same file: the original version and a re-formatted version. The docs will all be either PDF or Word at this point (but other formats such as HTML may be added). I need to be able to search for matching text/data in both versions. It's vital that the link between the database record and the files is never broken (i.e. I can't afford a situation where the database record points to a document that no longer exists because it's been deleted etc).
I can see pros/cons of each approach (filesystem/DB) but wondered whether anyone has any advice/experience with doing it either way and can provide any comments or warn of potential gotchas etc.
From my perspective, pros/cons are below:
Stored in DB
pros:
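- Only one system to manage and back up (a single point of failure rather than two)
- If relationships are set up correctly, it won't be possible to 'orphan' documents from their associated records
- Can search the document text using the DB's full-text search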
- "Anywhere" availability - i.e. If I can access the db - I can access the documents
cons:
- Performance (possibly)
- Using a db for what essentially is a filesystem task
- If the db is unavailable due to an outage, the documents are unavailable too
- db size will be impacted heavily (number of documents likely to be large)
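For what it's worth, here's a rough sketch of what I mean by setting the relationships up so that documents can't be orphaned. It uses Python with SQLite purely for illustration (I haven't settled on a database), and the table names records and document_versions, the kind column and the add_document helper are all made up for this example.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a record table and a version table."""
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless this is switched on per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE records (
            id    INTEGER PRIMARY KEY,
            title TEXT NOT NULL
        );
        CREATE TABLE document_versions (
            record_id INTEGER NOT NULL
                      REFERENCES records(id) ON DELETE CASCADE,
            kind      TEXT NOT NULL
                      CHECK (kind IN ('original', 'reformatted')),
            content   BLOB NOT NULL,
            PRIMARY KEY (record_id, kind)
        );
        """
    )
    return conn


def add_document(conn, title, original, reformatted):
    """Insert a record and both file versions in one transaction."""
    with conn:  # commits on success, rolls back if any insert fails
        cur = conn.execute("INSERT INTO records (title) VALUES (?)", (title,))
        record_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO document_versions (record_id, kind, content) "
            "VALUES (?, ?, ?)",
            [
                (record_id, "original", original),
                (record_id, "reformatted", reformatted),
            ],
        )
    return record_id
```

With this layout a version row can't exist without its record (the insert is rejected), and deleting a record removes both versions with it. It doesn't address the size/performance cons above.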
Stored in filesystem
pros:
- Performance
- No space taken in DB
- Documents still available if DB becomes unavailable
cons:
- If a file is deleted from outside the app, the db will point to a file that no longer exists
- Now have 2 points of failure
- Searching through documents may be more difficult (I haven't checked this though)
My "gut" feel is that I'd prefer to use the DB, but if I'm honest that's probably driven by my lack of experience with maintaining links to the filesystem and ensuring those links stay intact (i.e. if a file is deleted outside of my app, how to deal with the link/missing file within the app).
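If I went the filesystem route, I imagine I'd need some kind of periodic check along these lines. This is only a sketch in Python with SQLite: the document_files table, its path and sha256 columns, and the function names are assumptions of mine, not something that exists in my app yet.

```python
import hashlib
import os
import sqlite3


def sha256_of(path):
    """Hash a file in chunks so large documents are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_broken_links(conn):
    """Return (record_id, path, reason) for each stored link that no longer
    matches a file on disk. Assumes a hypothetical table
    document_files(record_id, path, sha256)."""
    broken = []
    rows = conn.execute(
        "SELECT record_id, path, sha256 FROM document_files ORDER BY record_id"
    )
    for record_id, path, expected in rows:
        if not os.path.isfile(path):
            broken.append((record_id, path, "missing"))
        elif sha256_of(path) != expected:
            broken.append((record_id, path, "modified"))
    return broken
```

This only detects a broken link after the fact, it doesn't stop someone deleting the file, which is really the part I'm unsure how to handle.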
If anyone has any experience with either approach I'd really appreciate your comments.
Cheers
Martin