Relative Content

Tag Archive for text-processing

Tokenizing Text Held in a Rope Data Structure

I am building a text editor that uses a Ragel-based tokenizer for syntax highlighting. I am considering a rope data structure to support efficient modifications and undo/redo operations. Is there a standard approach for tokenizing or searching text held in this kind of data structure? Note that some characters can cause the tokenizer to consume the rest of the stream.
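One way to picture the problem is to walk the rope's leaves in order and feed each chunk to a resumable tokenizer that carries any unfinished token over to the next leaf. The sketch below is only an illustration under stated assumptions: `Leaf`, `Concat`, `iter_chunks`, and `tokenize_rope` are hypothetical names, and a simple regular expression stands in for the Ragel machine.

```python
import re
from dataclasses import dataclass
from typing import Iterator, Union


@dataclass
class Leaf:
    text: str


@dataclass
class Concat:
    left: "Rope"
    right: "Rope"


Rope = Union[Leaf, Concat]

# Stand-in for the real tokenizer: whitespace runs, word runs, or one symbol.
TOKEN = re.compile(r"\s+|\w+|[^\w\s]")


def iter_chunks(node: Rope) -> Iterator[str]:
    """Yield leaf strings left to right without joining the whole rope."""
    stack = [node]
    while stack:
        current = stack.pop()
        if isinstance(current, Leaf):
            if current.text:
                yield current.text
        else:
            # Push right first so the left subtree is visited first.
            stack.append(current.right)
            stack.append(current.left)


def tokenize_rope(rope: Rope) -> Iterator[str]:
    """Tokenize chunk by chunk, carrying an unfinished token across leaves."""
    pending = ""
    for chunk in iter_chunks(rope):
        buffer = pending + chunk
        pending = ""
        for match in TOKEN.finditer(buffer):
            if match.end() == len(buffer):
                # This token touches the chunk boundary and may continue
                # in the next leaf, so hold it back instead of emitting it.
                pending = match.group()
                break
            if not match.group().isspace():
                yield match.group()
    if pending and not pending.isspace():
        yield pending


rope = Concat(Concat(Leaf("let co"), Leaf("unt = 4")), Leaf("2;"))
tokens = list(tokenize_rope(rope))
```

Here `count` and `42` are each split across two leaves and are still emitted as single tokens. A token that runs to the end of the stream (the case mentioned in the question) would make `pending` grow with every leaf, so a real implementation would more likely keep the scanner's state between chunks than re-scan the carried text.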

Generate all possible outcomes from pattern or range

I have multiple patterns that I want to expand. Expansion should cover number and letter ranges enclosed in curly braces, and number ranges need to support padding. I want each pattern expanded into a List(Of String) for ease of iteration. A pattern may contain multiple sets of curly braces, and they can appear in any position.
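A minimal sketch of one possible approach, written in Python with a plain list standing in for `List(Of String)`. The question does not show the pattern syntax, so the `{start-end}` form, the rule that a leading zero on the start value sets the padding width, and the names `expand_range` and `expand_pattern` are all assumptions made for illustration.

```python
import itertools
import re

# Assumed syntax: {start-end}, where both ends are integers or single letters.
RANGE = re.compile(r"\{(\d+|[A-Za-z])-(\d+|[A-Za-z])\}")


def expand_range(start: str, end: str) -> list[str]:
    """Expand one range into its values, zero-padding numbers if requested."""
    if start.isdigit() and end.isdigit():
        # Assumption: a leading zero on the start value fixes the width.
        width = len(start) if start.startswith("0") else 0
        return [str(n).zfill(width) for n in range(int(start), int(end) + 1)]
    if start.isalpha() and end.isalpha():
        return [chr(c) for c in range(ord(start), ord(end) + 1)]
    raise ValueError(f"mixed range: {start}-{end}")


def expand_pattern(pattern: str) -> list[str]:
    """Return every string the pattern can produce, in left-to-right order."""
    # With two capture groups, re.split yields: literal, start, end, literal, ...
    parts = RANGE.split(pattern)
    literals = parts[0::3]
    ranges = [expand_range(s, e) for s, e in zip(parts[1::3], parts[2::3])]

    results = []
    for combo in itertools.product(*ranges):
        pieces = [literals[0]]
        for value, literal in zip(combo, literals[1:]):
            pieces.append(value)
            pieces.append(literal)
        results.append("".join(pieces))
    return results


hosts = expand_pattern("srv{01-03}{a-b}")
```

With this input, `hosts` holds the six combinations from `srv01a` through `srv03b`. A pattern without braces comes back unchanged as a one-element list, because `itertools.product` with no arguments yields a single empty combination.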

Hardware accelerated text processing

Graphics processing units (GPUs) are very common and allow for efficient, parallel processing of floating-point numbers.
PPUs (Physics Processing Units) were a buzzword several years ago but never really caught on, and this kind of calculation is now handled by GPUs as well.
Both types of hardware specialize in processing a specific kind of data.

Sorting Sentences by New Words in Each

A very useful learning tool I stumbled across for Chinese was a massive list of sentences in which, barring the first 10 or 15, each sentence differed from the ones before it by only one or two words, or at least by as few as possible. The creator had sorted them by relation, evidently starting with a few standard sentences so that the first ones would not be random and complex.
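An ordering like this could be approximated with a greedy pass: start from a few seed sentences, then repeatedly pick the remaining sentence that introduces the fewest unseen words. This is only a sketch of that idea, not the method the list's creator used. The function name `order_by_new_words` and the `seed_count` parameter are hypothetical, and the code assumes sentences are already split into words by spaces, which for Chinese would require a separate word-segmentation step.

```python
def order_by_new_words(sentences: list[str], seed_count: int = 1) -> list[str]:
    """Greedily order sentences so each adds as few new words as possible."""
    remaining = list(sentences)

    # The first few sentences are taken as given, like the "standard" ones.
    ordered = remaining[:seed_count]
    del remaining[:seed_count]

    known: set[str] = set()
    for sentence in ordered:
        known.update(sentence.lower().split())

    while remaining:
        # Pick the sentence with the fewest words not seen so far;
        # ties go to the sentence that appears earliest in the input.
        best = min(remaining, key=lambda s: len(set(s.lower().split()) - known))
        remaining.remove(best)
        ordered.append(best)
        known.update(best.lower().split())
    return ordered


sample = [
    "i like tea",
    "the quick brown fox jumps",
    "i like green tea",
    "you like green tea",
]
lesson_order = order_by_new_words(sample)
```

Each pass scans every remaining sentence, so the cost grows quadratically with the number of sentences. That is fine for a sketch, but a massive list would likely need an index from words to sentences.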