I have seen people struggle with patterns in strings. There is a lot of literature on the subject, since strings are one of the oldest data types. But nothing is more important than your CLI tool that no one knows how to use but you. No, really: pattern matching is one of the most complex tasks in Data Science. But it is really useful, and once you master it, you will rule. Here is why it is worth the effort:
- Filter strings efficiently, and therefore filter information efficiently.
- Look for patterns across a whole text or dataset.
- Lightweight
- Well integrated with most GNU tools (grep, awk, etc.).
- Plenty of community support and tutorials out there (Reddit, for example).
- Helps with cleaning your dataset, preparing it for ETL processes, etc.
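
To make the filtering and cleaning points a bit more concrete, here is a minimal sketch. It assumes the patterns are written as regular expressions and uses Python's standard `re` module. The sample log lines, the pattern, and the `filter_errors` helper are all made up for illustration.

```python
import re

# Hypothetical sample data, invented for this example.
lines = [
    "2021-03-04 INFO  user=alice action=login",
    "2021-03-04 ERROR user=bob action=upload",
    "malformed line",
    "2021-03-05 ERROR user=carol action=login",
]

# A date at the start of the line, the ERROR level, then a user name.
pattern = re.compile(r"^(\d{4}-\d{2}-\d{2})\s+ERROR\s+user=(\w+)")

def filter_errors(rows):
    """Return a (date, user) tuple for every row that matches the pattern."""
    result = []
    for row in rows:
        m = pattern.match(row)
        if m:  # rows that do not match are simply dropped
            result.append((m.group(1), m.group(2)))
    return result

errors = filter_errors(lines)
print(errors)
```

One pattern does two jobs here: it throws away the rows you do not care about (including the malformed one) and pulls out the fields you do care about, which is roughly what a cleaning step before an ETL process looks like.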
Let’s begin.