I have a class that enforces database-friendly formatting for some of its properties in the __construct() method.
This guarantees that if a new object is created and saved, it will be saved correctly. But saving new objects happens rarely. Retrieving them happens thousands of times more frequently, and these same rules are applied to the db-extracted data when it goes through __construct(), which is a waste of resources.
Perhaps a bit of regex is not something I should worry about, but it bothers me. Am I doing this right?
If all of this stuff is enforced on the way in (write), should I avoid enforcing it on the way out (read)? And if so, how? If not, why not?
Potential Problems
- In case the database does have incorrectly stored data, it will be cleaned and misrepresented in my frontend.
- If I change the requirements and update the cleaning code, the same thing happens to data stored under the old rules.
More info:
I am saving objects that represent database fields. So I am storing their datatype (varchar, decimal, integer, etc), size (“25” or “9,2”, etc), default values, nullability, etc.
So the object name attribute must conform to MySQL naming rules (cannot start with a number) plus some rules I’ve added just to simplify things (only numbers, ascii letters, and “_”). This will be used as a column name in tables.
The “size” needs to match the data type, as well.
This is all PHP and MySQL running on Ubuntu.
A good rule is to avoid having objects that are in an illegal state.
Therefore creating objects based on data from the database that is illegal (or indeed has become illegal because you have changed the rules) without anyone knowing about it would be bad.
It would of course also be bad if no one could start your system because one record somewhere was not legal; that would make the process of correcting those errors difficult.
So when loading data collect the errors and present them to the users, so that they can correct them. If you do not load all data at the start, make sure that the users understand that those errors are the errors found in this section, there might be others in other sections.
This would make it fairly easy to spot errors when the rules change and to get them corrected as soon as possible.
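As a rough illustration of collecting errors at load time instead of failing on the first bad record, here is a minimal sketch. The function names (validateFieldRow, loadFieldRows) and the row layout with a 'name' key are made up for the example; the name rule is the one described in the question.

```php
<?php
declare(strict_types=1);

// Hypothetical validator: returns a list of problems instead of throwing,
// so a single bad record cannot stop the whole load.
function validateFieldRow(array $row): array
{
    $errors = [];
    $name = (string) ($row['name'] ?? '');

    // Rule from the question: ASCII letters, digits and "_" only,
    // and the name must not start with a digit.
    if (preg_match('/^[A-Za-z_][A-Za-z0-9_]*\z/', $name) !== 1) {
        $errors[] = "invalid name '{$name}'";
    }

    return $errors;
}

// Hypothetical loader: separates usable rows from a per-row error report
// that can be shown to the users of this section.
function loadFieldRows(array $rows): array
{
    $valid = [];
    $report = [];

    foreach ($rows as $index => $row) {
        $errors = validateFieldRow($row);
        if ($errors === []) {
            $valid[] = $row;
        } else {
            $report[$index] = $errors;
        }
    }

    return ['valid' => $valid, 'errors' => $report];
}

$result = loadFieldRows([
    ['name' => 'unit_price'],
    ['name' => '9lives'],
]);
```

Here $result['valid'] holds the rows that passed and $result['errors'] is keyed by row index, so the frontend can list exactly which records need fixing.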
If data is coming out of your database, then you can [usually] assume that it’s correct. It will have gone through a lot of checking to make it through in the first place, so re-checking it on the way out again is [usually] overkill.
Note the peppering of ‘usually’s in the above.
Not all of the data that gets into your database does so through applications and, even if it does, not all of those applications can be guaranteed to be as stringent in their checking as you might hope. It is, therefore, possible to get data into your database that can be considered “wrong”. If you re-check data on the “way out” of the database and reject whatever fails, then you have no way of returning that errant data to the “outside world” where someone can do something about it.
Perhaps you need another “construct” method that takes an object returned from your database (RecordSet, DataReader, etc.) and populates the instance from that (no matter what it’s given).
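One way this could look in PHP is a pair of static factory methods around a private constructor. This is only a sketch: the class name FieldDefinition, the methods create and fromDatabaseRow, and the 'name' and 'data_type' row keys are all assumptions for illustration, not anything from the original code.

```php
<?php
declare(strict_types=1);

final class FieldDefinition
{
    private string $name;
    private string $dataType;

    // Private: instances are only built through the two factories below.
    private function __construct(string $name, string $dataType)
    {
        $this->name = $name;
        $this->dataType = $dataType;
    }

    // Write path: enforce the naming rule before anything can be saved.
    public static function create(string $name, string $dataType): self
    {
        if (preg_match('/^[A-Za-z_][A-Za-z0-9_]*\z/', $name) !== 1) {
            throw new InvalidArgumentException("Invalid column name: {$name}");
        }

        return new self($name, strtolower($dataType));
    }

    // Read path: populate from a fetched row as-is, with no cleaning,
    // so whatever is stored is what the caller sees.
    public static function fromDatabaseRow(array $row): self
    {
        return new self((string) $row['name'], (string) $row['data_type']);
    }

    public function name(): string
    {
        return $this->name;
    }

    public function dataType(): string
    {
        return $this->dataType;
    }
}
```

With this split, the regex runs only when a new object is created, and data read back from MySQL is neither re-validated nor silently altered.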