How should a website validate a user's mailing address?
This is for a site that relies on shipping items via UPS or FedEx. I know there is software out there that does this (http://en.wikipedia.org/wiki/Coding_Accuracy_Support_System), but what if you are trying to build your own solution for a simple website?
String patterns that can be used to filter and group files
One of our applications filters the files in a certain directory, extracts some data from each file, and exports a document from the extracted data. The algorithm for extracting the data depends on the file, and so far we use a regex to select the algorithm to be used: for example, .*.txt will be processed by algorithm A, foo[0-5].xml will be processed by algorithm B, etc.
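A minimal sketch of the regex-to-algorithm selection described above, in Python. The handler names `algorithm_a` and `algorithm_b` and the `RULES` table are hypothetical stand-ins, not the application's real code. The dot before each extension is escaped here on the assumption that a literal dot was intended.

```python
import re

# Hypothetical extraction routines; the real ones would parse the file.
def algorithm_a(path):
    return f"A:{path}"

def algorithm_b(path):
    return f"B:{path}"

# Ordered (pattern, handler) pairs: the first pattern that matches wins.
RULES = [
    (re.compile(r".*\.txt"), algorithm_a),
    (re.compile(r"foo[0-5]\.xml"), algorithm_b),
]

def select_algorithm(filename):
    """Return the handler whose pattern matches the whole file name, or None."""
    for pattern, handler in RULES:
        if pattern.fullmatch(filename):
            return handler
    return None
```

Using `fullmatch` rather than `search` keeps a rule such as `foo[0-5]\.xml` from also picking up names like `foo3.xml.bak`.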
Can the CSV format be defined by a regex?
A colleague and I have recently argued over whether a pure regex is capable of fully encapsulating the CSV format, such that it can parse all files with any given escape char, quote char, and separator char.
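To make the argument concrete, here is a rough Python sketch for one fixed dialect. It assumes a quote is escaped by doubling it, which is only one of the possible escape conventions. The names `field_pattern` and `split_record` are made up for this illustration. Note that the regex only describes a single field; a small loop outside the regex walks the record, so this does not settle whether a pure regex suffices.

```python
import re

def field_pattern(sep=",", quote='"'):
    """Build a regex for one field of a fixed dialect (quote escaped by doubling)."""
    s, q = re.escape(sep), re.escape(quote)
    # Either a quoted field (group 1) or a run of plain characters (group 2).
    return re.compile(rf"{q}((?:[^{q}]|{q}{q})*){q}|([^{s}{q}]*)")

def split_record(line, sep=",", quote='"'):
    """Split one record into fields; raise ValueError on malformed input."""
    field = field_pattern(sep, quote)
    fields, pos = [], 0
    while True:
        m = field.match(line, pos)  # always matches, possibly the empty string
        if m.group(1) is not None:
            fields.append(m.group(1).replace(quote * 2, quote))
        else:
            fields.append(m.group(2))
        pos = m.end()
        if pos == len(line):
            return fields
        if not line.startswith(sep, pos):
            raise ValueError(f"unexpected character at position {pos}")
        pos += len(sep)
```

For example, `split_record('a,"b,""c""",')` yields the three fields `a`, `b,"c"`, and an empty string. The separator and quote are baked into the compiled pattern, so each dialect needs its own regex.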
Constructing a Deterministic Finite State Automaton for a given Regex
I have a couple of exam questions for my compilers class and wanted to check if my solutions are correct.
Constructing a Finite State Automaton
I have an exam question whose answer I am unsure of. The question is:
How to choose a proper parser generator for PHP
Some programmers avoid regexes in some situations (see this popular @nickf comment), perhaps using a parsing framework such as Lex/Yacc. Others prefer to stay within PHP, perhaps using regular expressions, as it avoids the need for another framework.
Domain-specific language for text search/processing?
I work for an organization that does a lot of work with government data. We have a couple of different projects where we’ve abstracted out common text search/manipulation operations into reusable libraries, for things like standardizing the way politicians’ names are displayed (e.g., transforming “MCDONALD, BOB (R-VA)” into “Bob McDonald (R-VA)”), or finding legal citations in text (e.g., finding occurrences of things like “1 U.S.C. 7” in text, determining that it’s a US Code citation, and returning a structure that says it’s referring to section 7 of title 1). These are relatively simple operations, and lots of collaborators in our space would like to use them, but we end up having to pick a language in which to implement each (the former is in Python; the latter, JavaScript), and we freeze out potential consumers/contributors who work in different languages and don’t want to resort to hacks like shelling out to a node process to handle their text. This all seems like a shame because what we’re expressing is so simple, and ought, one would think, to be pretty easy to share.
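For a sense of how small these operations are, here is a rough Python sketch of both. The function names, the patterns, and the "Mc" special case are assumptions made for this illustration, not the organization's actual libraries. The citation is read as title first, then section, which is the usual U.S. Code convention.

```python
import re

# "MCDONALD, BOB (R-VA)": last name, first name, party letter, two-letter state.
NAME = re.compile(r"^([A-Z'\-]+),\s+([A-Z]+)\s+\(([A-Z])-([A-Z]{2})\)$")

# "1 U.S.C. 7": title number, then section number.
US_CODE = re.compile(r"\b(\d+)\s+U\.S\.C\.\s+(\d+)")

def fix_case(word):
    """Title-case a name part, with a special case for the 'Mc' prefix."""
    word = word.capitalize()
    if word.startswith("Mc") and len(word) > 2:
        word = "Mc" + word[2:].capitalize()
    return word

def standardize_name(raw):
    """Return 'First Last (P-ST)', or the input unchanged if it does not match."""
    m = NAME.match(raw)
    if not m:
        return raw
    last, first, party, state = m.groups()
    return f"{fix_case(first)} {fix_case(last)} ({party}-{state})"

def find_us_code_citations(text):
    """Return one dict per US Code citation found in the text."""
    return [
        {"type": "usc", "title": m.group(1), "section": m.group(2)}
        for m in US_CODE.finditer(text)
    ]
```

Both functions are essentially a regex plus a little post-processing, which is the part a shared, language-neutral notation would need to capture.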
Why doesn’t nginx’s HTTP parser use regular expressions?
I see that the HTTP parser written by Igor Sysoev for nginx does not use regular expressions.