Enforce collocation of data in the region where it is used

How can I make sure that data specific to a particular region stays in the data center closest to that region, to ensure low latency?

Let’s consider Amazon’s e-commerce site as an example. It sells products all over the world, but not every product, or every seller of a product, is available in every region. So there is no point in showing, let’s say, ABC speakers, which are not sold in Australia, to customers in Australia.

So if a user in Australia wants to list all the speakers, a simple query with where country='AUSTRALIA' will work (in the simplest case).
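To make the query concrete, here is a minimal sketch using an in-memory SQLite database. The table name, the columns and the sample rows are assumptions made for illustration only; a real catalog would look different.

```python
import sqlite3

# Hypothetical schema: one row per (product, country) in which it is sold.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, category TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [
        ("Sony", "Speaker", "AUSTRALIA"),
        ("Bose", "Speaker", "AUSTRALIA"),
        ("ABC", "Speaker", "JAPAN"),  # not sold in Australia
    ],
)


def list_products(category, country):
    """Return the names of products in `category` sold in `country`, sorted."""
    rows = conn.execute(
        "SELECT name FROM product "
        "WHERE category = ? AND country = ? ORDER BY name",
        (category, country),
    )
    return [name for (name,) in rows]


print(list_products("Speaker", "AUSTRALIA"))  # ['Bose', 'Sony']
```

This only covers filtering. It says nothing about where the rows physically live, which is the latency question below.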

Question 1:

Next comes the latency part, which is what my question is about. How do we ensure that the products sold in Australia are the only ones present in the Australian data center’s database? If we fire the above query, the partition (or even the replica of the partition) that carries the data for product = Speaker and country = Australia might be located in Japan.

As I understand it, Amazon or a similar e-commerce site will probably have an Elasticsearch cluster that is geographically distributed, and partitioning on key = country alone will not solve this.

Question 2:

Is it a good idea to maintain a separate database for each country to solve the above issue?

This question extends to Uber as well. Uber keeps track of all the rides available in every region of the world where it operates in its Redis cluster. Now, when a user wants to search for a ride in region-1 (say, Australia), it is not a good idea to send the request to the USA just because the partition that handles the Australia region happens to be located in the USA.

Can you please give some idea of how to make sure data is collocated with the region it is used in?

EDIT 1

The image below shows the layout of the application and the structure of the product table. The DB cluster consists of servers S1, S2 and S3; partitions are denoted by p* and data centers by DC*.

Assume the products Sony and Bose will map to partition p1.

In the diagram, the user request has landed at DC-AUS (data center Australia), but the products that are available in Australia are mapped to partition p1, which is present in Japan and the USA.

[Image: layout of the application and structure of the product table]
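The situation described for the diagram can be sketched as a small lookup. The data center names DC-JPN and DC-USA are hypothetical (only DC-AUS is named above), and the sketch assumes that p1 has replicas in Japan and the USA only.

```python
# Assumed from the description above: p1 has replicas in Japan and the USA only.
# DC-JPN and DC-USA are hypothetical names following the DC-AUS pattern.
PARTITION_REPLICAS = {"p1": {"DC-JPN", "DC-USA"}}

# Sony and Bose both map to partition p1, as stated above.
PRODUCT_TO_PARTITION = {"Sony": "p1", "Bose": "p1"}


def has_local_replica(product, request_dc):
    """True if the partition holding `product` has a replica in `request_dc`."""
    partition = PRODUCT_TO_PARTITION[product]
    return request_dc in PARTITION_REPLICAS[partition]


def remote_reads(products, request_dc):
    """Products whose data must be fetched from another data center."""
    return [p for p in products if not has_local_replica(p, request_dc)]


print(remote_reads(["Sony", "Bose"], "DC-AUS"))  # ['Sony', 'Bose']
print(remote_reads(["Sony", "Bose"], "DC-JPN"))  # []
```

A request landing at DC-AUS has to cross regions for every product, which is exactly the latency problem being asked about.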


"How do we ensure that products sold in Australia are the only ones present in the Australian data center’s database?"

You can have different URLs (sites) for different countries, e.g. amazon.com.au.

Going beyond the question: this way you can also follow country-specific regulations. For example, a database (or any other software) may need to have a specific patch installed for auditing or compliance purposes.

"Because, if we fire the above query, the partition (or even the replica of the partition) that carries the information about product = Speaker and country = Australia might be present in Japan."

No, it would be present in Australia’s data center as well, because you have the following options:

  1. You have a different URL (and database) for each country.
    OR
  2. You have a complete replica of the database in each data center. This means each partition is present in every data center. For example, if your e-commerce DB supports the countries AUS, JAP, US and UK and you decide to have just one DB (composed of servers and partitions), the same DB (replica) will be present in each country’s data center.

"Now when a user wants to search for a ride in region-1, it will not be a good idea to send this request to the USA because the partition that is handling the region of Australia is actually present in the USA."

As mentioned, the request will be sent to the closest data center, which will contain a complete copy of the database.
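As a rough illustration of "closest data center", the sketch below picks the data center with the lowest round-trip latency for the user's region. The latency figures and the DC-JPN and DC-USA names are made up for the example. In practice this decision is usually made by DNS-based geo or latency routing rather than by application code.

```python
# Hypothetical round-trip latencies in milliseconds from a user's region
# to each data center. Every data center holds a full replica (option 2),
# so any of them can answer; only the latency differs.
LATENCY_MS = {
    "AUSTRALIA": {"DC-AUS": 20, "DC-JPN": 120, "DC-USA": 180},
    "JAPAN": {"DC-AUS": 120, "DC-JPN": 15, "DC-USA": 110},
}


def closest_data_center(user_region):
    """Return the data center with the lowest latency for `user_region`."""
    latencies = LATENCY_MS[user_region]
    return min(latencies, key=latencies.get)


print(closest_data_center("AUSTRALIA"))  # DC-AUS
print(closest_data_center("JAPAN"))      # DC-JPN
```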

It seems like you are conflating two separate issues here:

  1. How do I restrict items so they can only be sold to customers residing in particular countries?

    This is easy enough: work out where the customer resides, either by geolocating their IP or by examining their payment or delivery information, and compare that to the list of countries for the product.

  2. How do I optimise my traffic so that customers use a copy of my website which is hosted geographically close to them?

    This is more complex, but there are several commercial products you can buy off the (cloud) shelf. Or, more simply, you can just have a different URL for each country.
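For the first issue, a minimal sketch of the check might look like this. The Product class, the country codes and the priority order between delivery, payment and IP signals are all assumptions for illustration; the geolocation lookup itself is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    name: str
    sold_in: frozenset  # country codes where the product may be sold


def resolve_country(delivery_country=None, payment_country=None, ip_country=None):
    """Pick the customer's country from the available signals.

    The priority order (delivery, then payment, then IP) is an assumption.
    """
    return delivery_country or payment_country or ip_country


def visible_products(catalog, customer_country):
    """Names of the products that may be sold in `customer_country`."""
    return [p.name for p in catalog if customer_country in p.sold_in]


CATALOG = [
    Product("Sony", frozenset({"AU", "JP"})),
    Product("Bose", frozenset({"AU"})),
    Product("ABC", frozenset({"JP"})),
]

country = resolve_country(ip_country="AU")
print(visible_products(CATALOG, country))  # ['Sony', 'Bose']
```

Note that this check is independent of where the data is hosted, which is why the two issues are worth keeping apart.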

Now, your final question:

  1. Should each country specific site have the same database?

    This is more complex and depends on your business.

  • A clothing retailer, for example, may not want to sell winter coats to Australians in summer (bizarrely, some do want to, because fashion).

    They might have a completely different dataset for each season and each region.

  • Fidget Spinners Inc might want to sell the same thing at the same price to everyone in the world.

    They can just replicate their dataset globally.

  • Computer Games Ltd might want to sell the same game but at a different price depending on where the customer is.

    They might want the same dataset but with region-specific fields.
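The last case, one dataset with region-specific fields, could be sketched like this. The record layout, the title and the prices are invented for the example; prices are stored in minor units (cents) to avoid floating-point rounding.

```python
# Hypothetical product record: shared fields plus a per-region price map.
GAME = {
    "title": "Example Game",
    "prices": {
        "US": ("USD", 5999),  # 59.99 USD, stored in cents
        "AU": ("AUD", 8995),  # 89.95 AUD, stored in cents
    },
}


def price_for(product, region):
    """Return (currency, minor_units) for `region`, or None if not sold there."""
    return product["prices"].get(region)


print(price_for(GAME, "AU"))  # ('AUD', 8995)
print(price_for(GAME, "JP"))  # None
```

The same record can then be replicated to every data center, and each regional site reads only its own entry.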

