The most of the database management systems use B-tree as a data structure for increasing the performance.

Let’s imagine that we have a table users with the following columns: id (int), name (string), email (string), password (string), age (int). We want to select only ones that meet the condition: (name.contains("Bob") and not name.contains("A")) or (email.starts_with("angel") and (age >= 30 or age <= 20)). It’s a really complicated condition. How do the database management systems builds the B-tree and how do they move inside them to select the data? Which key should be used inside B-tree? If it’s achievable with another data structure type, it is also welcomed!

I am creating my own RDBMS and faced with this problem 🙂



When talking about databases we usually talk about indexes, they are often implemented using a B-Tree, but does not have to be, there are a wide variety of search structures that could be used, depending on the specific use case, hash tables, binary trees, R-Trees, and many others. Typically the primary key would always have a index, but other columns would have indexes created by the user.

Not all queries will be able to use indexes. A complicated query like yours might be able to use the index for the age to narrow down the number of items, and do a linear scan for the other conditions. If you make a sufficiently complicated query the database engine might decide to just do a full table scan. Indexes supporting queries like name.contains("A") would also need to be more complicated than a regular index, and may have limited availability.

There are composite indexes that can be used when querying multiple columns. I’m no database expert, but I would guess they just start using another column for the key at some depth level of the b-tree.

I would note that creating a general purpose RDBMS system is hard. So you need to carefully consider what your system will do that existing systems do not. It might also be easier to create a specialized system for your specific needs.


Trả lời

Email của bạn sẽ không được hiển thị công khai. Các trường bắt buộc được đánh dấu *