How to calculate new property for old records efficiently?

Let’s say we have classes something like this:

public class Item1{
    public Guid Id {get; set;}
    public string Name {get; set;}
    public decimal Price {get; set;}
    public int Quantity {get; set;}

And few more, which has additional or different fields, other have lists as properties.

And those classes are being serialized into Xml string and placed as property of another class, which will be saved in database.

public class Record{
    public Guid Id {get; set;}
    public DateTime CreatedOn {get; set;}
    public string SerializedItem {get; set;}

When we need, we pull out record and deserialize into class Item1, Item2 or whatever. Everything is nice and fine.

But one day we decide, that we need additional property Discount. If Price * Quantity > 1000 discount would be 10 for that item, if greater than 5000 discount 20 and so on. And we would use those classes for showing in a grid(simple list), generating reports and probably like a variable in classes ( e.g. if discount is greater than 20 that item would could not get a free shipping or something). In order not to duplicate Discount calculation, we decide to have additional property in our class Item1 like public decimal Discount {get; set;}, which is calculated while inserting this new item.

So here’s the problem we see: when deserializing old items they would not have that property, so deserializer would give a default values like null, 0, false depending on property type. However we want old items to have a Discount value calculated according to the logic that new items Discount is calculated.

So how to achieve that efficiently? Would it be wise to to have a method that would recalculate that discount(note that there might be a couple thousands of item, and logic would be more complicated) before giving these items to grid, reports and somewhere else?
Or is it more efficient to have a script which would loop through all records and would calculate discount, and run it once after upgrading to new version?

Any insights and suggestion are appreciated.
And please let me know or edit if title and tags are inappropriate or too vague.


So how to achieve that efficiently?

You need to store a version of your serialized data. You should also write a migration logic in your serializer so that if it encounters old version of the data, it would call data-migrator for it after deserialization.

After migration the data should be saved in a new format.

You can upgrade versions of your data on the first access attempt, or do this at software upgrade time.

But it would be better not to do that on a regular basis (in reports for example). If you will constantly keep old versions of your data intact, very soon you will have a version nightmare. You will have to support old versions everywhere in your app.

When you add a new column to a database table, you will usually either give the column a default or update the current rows in the same script. You’re sort of doing the same thing here, just with the extra indirection of having a serialized string instead of a table, it’s a little trickier.

Many SQL implementations have tools for parsing and writing XML, so you could write a script to scan the database and update all the records at once.

The drawback here is that this will take some time. If there are too many rows to update them all at once, you could do what Caleth suggested and update on access. Then, after enough records have been updated, run the script to get the last few.

If possible, refactoring the database to have a separate table for the items or an XML column instead of using XML strings could make future updates easier and faster. If you don’t want to go that route, adding a version number to your XML will make it a lot easier to tell what should be in the deserialized object. You don’t want a dozen if (property == null) { createProperty(); } lines in your deserializer.

Trả lời

Email của bạn sẽ không được hiển thị công khai. Các trường bắt buộc được đánh dấu *