When responding to an HTTP request for a commonly used file on my web server, for example index.html, rather than reading the file from disk each time, I list those popular resources in an array and read them into memory when the server starts, so those files don't have to be read on every request.
I think this might increase the efficiency of the web server if traffic reaches extreme levels, but I don't actually know how the file reader works internally, so I could be wrong. Does reading a file beforehand significantly improve web server performance with Node.js?
While Tim is right in saying that system calls can be pretty fast, they are still slow compared to just grabbing a chunk from your own memory.
What’s more important though is that all other cache layers don’t have the same information that the application has. For example the OS doesn’t know that the file will be served using HTTP. The app does. It can therefore not only cache it, but also cache a gzipped version along with it, with which it can respond to all clients that accept gzipped responses (which is probably 99% of them). If a file gets requested at very high frequency, you don’t want to be gzipping it for every request.
So yes, it can make a huge difference, but the only way to tell is by measuring. Writing your own caches introduces a source of errors (memory leaks, stale state, …), so cache only the stuff that demonstrably causes a bottleneck. For everything else, stick with the simplest solution.
Keep in mind any caching done by the operating system or file system of the server. There’s a pretty good chance that once opened, you’re reading the files from the system’s cache. Keeping them in memory as you do becomes self-defeating only if the application is paged out, but you seem to be more interested in what would happen under heavy load, so that’s not an issue.
What you’re doing, then, is basically saving the underlying system calls to open(), read(), and close(), which might give you a tiny speedup, but these calls are incredibly quick if the OS already has the files in its cache.
It’s probably a tiny bit of help at the cost of a few kilobytes of memory, but it wouldn’t really scale. If you had, say, hundreds or thousands of pages, you’d probably be better off letting the OS cache them, or storing them somewhere I/O isn’t a big bottleneck to retrieve them.