{ "metadata": { "name": "", "signature": "sha256:237609a5ef934bf65a93a410c9e5107b808049dd04b0faf2b30f9b423699ba6c" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "[Sebastian Raschka](http://sebastianraschka.com) \n", "\n", "- [Link to this IPython notebook on Github](https://github.com/rasbt/python_reference/blob/master/tutorials/useful_regex.ipynb) " ] }, { "cell_type": "code", "collapsed": false, "input": [ "%load_ext watermark" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "code", "collapsed": false, "input": [ "%watermark -d -v -u -t -z" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Last updated: 06/07/2014 22:50:23 EDT\n", "\n", "CPython 3.4.1\n", "IPython 2.1.0\n" ] } ], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "[More information](http://nbviewer.ipython.org/github/rasbt/python_reference/blob/master/ipython_magic/watermark.ipynb) about the `watermark` magic command extension." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

\n", "I would be happy to hear your comments and suggestions. \n", "Please feel free to drop me a note via\n", "[twitter](https://twitter.com/rasbt), [email](mailto:bluewoodtree@gmail.com), or [google+](https://plus.google.com/+SebastianRaschka).\n", "

" ] }, { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "A collection of useful regular expressions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Sections" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- [About the `re` module](#About-the-re-module)\n", "- [Identify files via file extensions](#Identify-files-via-file-extensions)\n", "- [Username validation](#Username-validation)\n", "- [Checking for valid email addresses](#Checking-for-valid-email-addresses)\n", "- [Check for a valid URL](#Check-for-a-valid-URL)\n", "- [Checking for numbers](#Checking-for-numbers)\n", "- [Validating dates](#Validating-dates)\n", "- [Time](#Time)\n", "- [Checking for HTML tags](#Checking-for-HTML-tags)\n", "- [Checking for IP addresses](#Checking-for-IP-addresses)\n", "- [Checking for MAC addresses](#Checking-for-MAC-addresses)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "About the `re` module" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The purpose of this IPython notebook is not to rewrite a detailed tutorial about regular expressions or the in-built Python `re` module, but to collect some useful regular expressions for copy&paste purposes." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The complete documentation of the Python `re` module can be found here [https://docs.python.org/3.4/howto/regex.html](https://docs.python.org/3.4/howto/regex.html). Below, I just want to list the most important methods for convenience:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- `re.match()` : Determine if the RE matches at the beginning of the string.\n", "- `re.search()` : Scan through a string, looking for any location where this RE matches.\n", "- `re.findall()` : Find all substrings where the RE matches, and returns them as a list.\n", "- `re.finditer()` : Find all substrings where the RE matches, and returns them as an iterator." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you are using the same regular expression multiple times, it is recommended to compile it for improved performance.\n", "\n", " compiled_re = re.compile(r'some_regexpr') \n", " for word in text:\n", " match = comp.search(compiled_re))\n", " # do something with the match\n", " \n", "**E.g., if we want to check if a string ends with a substring:**" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import re\n", "\n", "needle = 'needlers'\n", "\n", "# Python approach\n", "print(bool(any([needle.endswith(e) for e in ('ly', 'ed', 'ing', 'ers')])))\n", "\n", "# On-the-fly Regular expression in Python\n", "print(bool(re.search(r'(?:ly|ed|ing|ers)$', needle)))\n", "\n", "# Compiled Regular expression in Python\n", "comp = re.compile(r'(?:ly|ed|ing|ers)$') \n", "print(bool(comp.search(needle)))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "True\n", "True\n", "True\n" ] } ], "prompt_number": 3 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 10000 -r 50 bool(any([needle.endswith(e) for e in ('ly', 'ed', 'ing', 'ers')]))\n", "%timeit -n 10000 -r 50 bool(re.search(r'(?:ly|ed|ing|ers)$', needle))\n", "%timeit -n 10000 -r 50 bool(comp.search(needle))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "10000 loops, best of 50: 2.74 \u00b5s per loop\n", "10000 loops, best of 50: 2.93 \u00b5s per loop" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "10000 loops, best of 50: 1.28 \u00b5s per loop" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Identify files via file extensions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A regular expression to check for file extensions. \n", "\n", "Note: This approach is not recommended for thorough limitation of file types (parse the file header instead). However, this regex is still a useful alternative to e.g., a Python's `endswith` approach for quick pre-filtering for certain files of interest." ] }, { "cell_type": "code", "collapsed": false, "input": [ "pattern = r'(?i)(\\w+)\\.(jpeg|jpg|png|gif|tif|svg)$'\n", "\n", "# remove `(?i)` to make regexpr case-sensitive\n", "\n", "str_true = ('test.gif', \n", " 'image.jpeg', \n", " 'image.jpg',\n", " 'image.TIF'\n", " )\n", "\n", "str_false = ('test.pdf',\n", " 'test.gif.pdf',\n", " )\n", "\n", "for t in str_true:\n", " assert(bool(re.match(pattern, t)) == True), '%s is not True' %t\n", "for f in str_false:\n", " assert(bool(re.match(pattern, f)) == False), '%s is not False' %f" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Username validation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Checking for a valid user name that has a certain minimum and maximum length.\n", "\n", "Allowed characters:\n", "- letters (upper- and lower-case)\n", "- numbers\n", "- dashes\n", "- underscores" ] }, { "cell_type": "code", "collapsed": false, "input": [ "min_len = 5 # minimum length for a valid username\n", "max_len = 15 # maximum length for a valid username\n", "\n", "pattern = r\"^(?i)[a-z0-9_-]{%s,%s}$\" %(min_len, max_len)\n", "\n", "# remove `(?i)` to only allow lower-case letters\n", "\n", "\n", "\n", "str_true = ('user123', '123_user', 'Username')\n", " \n", "str_false = ('user', 'username1234_is-way-too-long', 'user$34354')\n", "\n", "for t in str_true:\n", " assert(bool(re.match(pattern, t)) == True), '%s is not True' %t\n", "for f in str_false:\n", " assert(bool(re.match(pattern, f)) == False), '%s is not False' %f" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Checking for valid email addresses" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A regular expression that captures most email addresses." ] }, { "cell_type": "code", "collapsed": false, "input": [ "pattern = r\"(^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+$)\"\n", "\n", "str_true = ('test@mail.com',)\n", " \n", "str_false = ('testmail.com', '@testmail.com', 'test@mailcom')\n", "\n", "for t in str_true:\n", " assert(bool(re.match(pattern, t)) == True), '%s is not True' %t\n", "for f in str_false:\n", " assert(bool(re.match(pattern, f)) == False), '%s is not False' %f" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "source: [http://stackoverflow.com/questions/201323/using-a-regular-expression-to-validate-an-email-address](http://stackoverflow.com/questions/201323/using-a-regular-expression-to-validate-an-email-address)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Check for a valid URL" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Checks for an URL if a string ...\n", "\n", "- starts with `https://`, or `http://`, or `www.`\n", "- or ends with a dot extension" ] }, { "cell_type": "code", "collapsed": false, "input": [ "pattern = '^(https?:\\/\\/)?([\\da-z\\.-]+)\\.([a-z\\.]{2,6})([\\/\\w \\.-]*)*\\/?$'\n", "\n", "str_true = ('https://github.com', \n", " 'http://github.com',\n", " 'www.github.com',\n", " 'github.com',\n", " 'test.de',\n", " 'https://github.com/rasbt',\n", " 'test.jpeg' # !!! \n", " )\n", " \n", "str_false = ('testmailcom', 'http:testmailcom', )\n", "\n", "for t in str_true:\n", " assert(bool(re.match(pattern, t)) == True), '%s is not True' %t\n", "\n", "for f in str_false:\n", " assert(bool(re.match(pattern, f)) == False), '%s is not False' %f" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "source: [http://code.tutsplus.com/tutorials/8-regular-expressions-you-should-know--net-6149](http://code.tutsplus.com/tutorials/8-regular-expressions-you-should-know--net-6149)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Checking for numbers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Positive integers" ] }, { "cell_type": "code", "collapsed": false, "input": [ "pattern = '^\\d+$'\n", "\n", "str_true = ('123', '1', )\n", " \n", "str_false = ('abc', '1.1', )\n", "\n", "for t in str_true:\n", " assert(bool(re.match(pattern, t)) == True), '%s is not True' %t\n", "\n", "for f in str_false:\n", " assert(bool(re.match(pattern, f)) == False), '%s is not False' %f" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 9 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Negative integers" ] }, { "cell_type": "code", "collapsed": false, "input": [ "pattern = '^-\\d+$'\n", "\n", "str_true = ('-123', '-1', )\n", " \n", "str_false = ('123', '-abc', '-1.1', )\n", "\n", "for t in str_true:\n", " assert(bool(re.match(pattern, t)) == True), '%s is not True' %t\n", "\n", "for f in str_false:\n", " assert(bool(re.match(pattern, f)) == False), '%s is not False' %f" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 10 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "All integers" ] }, { "cell_type": "code", "collapsed": false, "input": [ "pattern = '^-{0,1}\\d+$'\n", "\n", "str_true = ('-123', '-1', '1', '123',)\n", " \n", "str_false = ('123.0', '-abc', '-1.1', )\n", "\n", "for t in str_true:\n", " assert(bool(re.match(pattern, t)) == True), '%s is not True' %t\n", "\n", "for f in str_false:\n", " assert(bool(re.match(pattern, f)) == False), '%s is not False' %f" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 11 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Positive numbers" ] }, { "cell_type": "code", "collapsed": false, "input": [ "pattern = '^\\d*\\.{0,1}\\d+$'\n", "\n", "str_true = ('1', '123', '1.234', )\n", " \n", "str_false = ('-abc', '-123', '-123.0')\n", "\n", "for t in str_true:\n", " assert(bool(re.match(pattern, t)) == True), '%s is not True' %t\n", "\n", "for f in str_false:\n", " assert(bool(re.match(pattern, f)) == False), '%s is not False' %f" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 12 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Negative numbers" ] }, { "cell_type": "code", "collapsed": false, "input": [ "pattern = '^-\\d*\\.{0,1}\\d+$'\n", "\n", "str_true = ('-1', '-123', '-123.0', )\n", " \n", "str_false = ('-abc', '1', '123', '1.234', )\n", "\n", "for t in str_true:\n", " assert(bool(re.match(pattern, t)) == True), '%s is not True' %t\n", "\n", "for f in str_false:\n", " assert(bool(re.match(pattern, f)) == False), '%s is not False' %f" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 13 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "All numbers" ] }, { "cell_type": "code", "collapsed": false, "input": [ "pattern = '^-{0,1}\\d*\\.{0,1}\\d+$'\n", "\n", "str_true = ('1', '123', '1.234', '-123', '-123.0')\n", " \n", "str_false = ('-abc')\n", "\n", "for t in str_true:\n", " assert(bool(re.match(pattern, t)) == True), '%s is not True' %t\n", "\n", "for f in str_false:\n", " assert(bool(re.match(pattern, f)) == False), '%s is not False' %f" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 14 }, { "cell_type": "markdown", "metadata": {}, "source": [ "source: [http://stackoverflow.com/questions/1449817/what-are-some-of-the-most-useful-regular-expressions-for-programmers](http://stackoverflow.com/questions/1449817/what-are-some-of-the-most-useful-regular-expressions-for-programmers)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Validating dates" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Validates dates in `mm/dd/yyyy` format." ] }, { "cell_type": "code", "collapsed": false, "input": [ "pattern = '^(0[1-9]|1[0-2])\\/(0[1-9]|1\\d|2\\d|3[01])\\/(19|20)\\d{2}$'\n", "\n", "str_true = ('01/08/2014', '12/30/2014', )\n", " \n", "str_false = ('22/08/2014', '-123', '1/8/2014', '1/08/2014', '01/8/2014')\n", "\n", "for t in str_true:\n", " assert(bool(re.match(pattern, t)) == True), '%s is not True' %t\n", "\n", "for f in str_false:\n", " assert(bool(re.match(pattern, f)) == False), '%s is not False' %f" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 15 }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "12-Hour format" ] }, { "cell_type": "code", "collapsed": false, "input": [ "pattern = r'^(1[012]|[1-9]):[0-5][0-9](\\s)?(?i)(am|pm)$'\n", "\n", "str_true = ('2:00pm', '7:30 AM', '12:05 am', )\n", " \n", "str_false = ('22:00pm', '14:00', '3:12', '03:12pm', )\n", "\n", "for t in str_true:\n", " assert(bool(re.match(pattern, t)) == True), '%s is not True' %t\n", "\n", "for f in str_false:\n", " assert(bool(re.match(pattern, f)) == False), '%s is not False' %f" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 29 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "24-Hour format" ] }, { "cell_type": "code", "collapsed": false, "input": [ "pattern = r'^([0-1]{1}[0-9]{1}|20|21|22|23):[0-5]{1}[0-9]{1}$'\n", "\n", "str_true = ('14:00', '00:30', )\n", " \n", "str_false = ('22:00pm', '4:00', )\n", "\n", "for t in str_true:\n", " assert(bool(re.match(pattern, t)) == True), '%s is not True' %t\n", "for f in str_false:\n", " assert(bool(re.match(pattern, f)) == False), '%s is not False' %f" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 18 }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Checking for HTML tags" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Also this regex is only recommended for \"filtering\" purposes and not a ultimate way to parse HTML. For more information see this excellent discussion on StackOverflow: \n", "[http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/) " ] }, { "cell_type": "code", "collapsed": false, "input": [ "pattern = r\"\"\"\\s]+))?)+\\s*|\\s*)/?>\"\"\"\n", "\n", "str_true = ('', '', '', '')\n", " \n", "str_false = ('a>', '')\n", "\n", "for t in str_true:\n", " assert(bool(re.match(pattern, t)) == True), '%s is not True' %t\n", "\n", "for f in str_false:\n", " assert(bool(re.match(pattern, f)) == False), '%s is not False' %f" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 16 }, { "cell_type": "markdown", "metadata": {}, "source": [ "source: [http://haacked.com/archive/2004/10/25/usingregularexpressionstomatchhtml.aspx/](http://haacked.com/archive/2004/10/25/usingregularexpressionstomatchhtml.aspx/)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Checking for IP addresses" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "IPv4" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](../Images/Ipv4_address.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Image source: http://en.wikipedia.org/wiki/File:Ipv4_address.svg" ] }, { "cell_type": "code", "collapsed": false, "input": [ "pattern = r'^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$'\n", "\n", "str_true = ('172.16.254.1', '1.2.3.4', '01.102.103.104', )\n", " \n", "str_false = ('17216.254.1', '1.2.3.4.5', '01 .102.103.104', )\n", "\n", "for t in str_true:\n", " assert(bool(re.match(pattern, t)) == True), '%s is not True' %t\n", "\n", "for f in str_false:\n", " assert(bool(re.match(pattern, f)) == False), '%s is not False' %f" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "source: [http://answers.oreilly.com/topic/318-how-to-match-ipv4-addresses-with-regular-expressions/](http://answers.oreilly.com/topic/318-how-to-match-ipv4-addresses-with-regular-expressions/)" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Ipv6" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](../Images/Ipv6_address.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Image source: http://upload.wikimedia.org/wikipedia/commons/1/15/Ipv6_address.svg" ] }, { "cell_type": "code", "collapsed": false, "input": [ "pattern = r'^\\s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)(\\.(25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)(\\.(25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)(\\.(25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)(\\.(25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)(\\.(25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)(\\.(25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)(\\.(25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)){3}))|:)))(%.+)?\\s*$'\n", "\n", "str_true = ('2001:470:9b36:1::2',\n", " '2001:cdba:0000:0000:0000:0000:3257:9652', \n", " '2001:cdba:0:0:0:0:3257:9652', \n", " '2001:cdba::3257:9652', )\n", " \n", "str_false = ('1200::AB00:1234::2552:7777:1313', # uses `::` twice\n", " '1200:0000:AB00:1234:O000:2552:7777:1313', ) # contains an O instead of 0\n", "\n", "for t in str_true:\n", " assert(bool(re.match(pattern, t)) == True), '%s is not True' %t\n", "\n", "for f in str_false:\n", " assert(bool(re.match(pattern, f)) == False), '%s is not False' %f" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 21 }, { "cell_type": "markdown", "metadata": {}, "source": [ "source: [http://snipplr.com/view/43003/regex--match-ipv6-address/](http://snipplr.com/view/43003/regex--match-ipv6-address/)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Checking for MAC addresses" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[[back to top](#Sections)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](../Images/MACaddressV3.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Image source: http://upload.wikimedia.org/wikipedia/en/3/37/MACaddressV3.png " ] }, { "cell_type": "code", "collapsed": false, "input": [ "pattern = r'^(?i)([0-9A-F]{2}[:-]){5}([0-9A-F]{2})$'\n", "\n", "str_true = ('94-AE-70-A0-66-83', \n", " '58-f8-1a-00-44-c8',\n", " '00:A0:C9:14:C8:29'\n", " , )\n", " \n", "str_false = ('0:00:00:00:00:00', \n", " '94-AE-70-A0 -66-83', ) \n", "\n", "for t in str_true:\n", " assert(bool(re.match(pattern, t)) == True), '%s is not True' %t\n", "\n", "for f in str_false:\n", " assert(bool(re.match(pattern, f)) == False), '%s is not False' %f" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 29 } ], "metadata": {} } ] } pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.

Alternative Proxies: