daurnimator/lpeg_patterns

A collection of LPEG patterns

Use cases

  • Strict validation of user input
  • Searching free-form input
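The two use cases differ mainly in how a pattern is anchored. The following is a minimal sketch, assuming the modules load as `lpeg_patterns.<name>` (the require path is not stated in this document); `is_ipv4` and `find_ipv4` are hypothetical helper names used only for illustration.

```lua
local lpeg = require "lpeg"
-- Assumption: the IPv4 module is installed as "lpeg_patterns.IPv4".
local IPv4address = require "lpeg_patterns.IPv4".IPv4address

-- Strict validation: anchor with end-of-input so the whole string must match.
local strict = IPv4address * lpeg.P(-1)
local function is_ipv4(s)
    return strict:match(s) ~= nil
end

-- Searching: retry the pattern at each successive position of the input.
local search = lpeg.P { IPv4address + 1 * lpeg.V(1) }
local function find_ipv4(s)
    local ip = search:match(s)
    return ip and tostring(ip) or nil
end

print(is_ipv4("192.0.2.1"))
print(is_ipv4("192.0.2.1 and more"))
print(find_ipv4("connect to 192.0.2.1 on port 80"))
```

Without the `lpeg.P(-1)` anchor a pattern succeeds as soon as a prefix of the input matches, which is usually not what input validation wants.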

Modules

core

A small module implementing commonly used rules from RFC-5234 appendix B.1

  • ALPHA (pattern)
  • BIT (pattern)
  • CHAR (pattern)
  • CR (pattern)
  • CRLF (pattern)
  • CTL (pattern)
  • DIGIT (pattern)
  • DQUOTE (pattern)
  • HEXDIG (pattern)
  • HTAB (pattern)
  • LF (pattern)
  • LWSP (pattern)
  • OCTET (pattern)
  • SP (pattern)
  • VCHAR (pattern)
  • WSP (pattern)

IPv4

  • IPv4address (pattern): parses an IPv4 address in dotted decimal notation. On success, returns the address as an IPv4 object
  • IPv4_methods (table): methods available on IPv4 objects
    • unpack (function): the IPv4 address as a series of four 8-bit numbers
    • binary (function): the IPv4 address as a 4 byte binary string
  • IPv4_mt (table): metatable given to IPv4 objects
    • __index (table): IPv4_methods
    • __tostring (function): returns the IPv4 address in dotted decimal notation

IPv4 "dotted decimal notation" in this document refers to "strict" form (see RFC-6943 section 3.1.1) unless otherwise noted.
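As a rough illustration of the IPv4 object described above, the sketch below parses one address and calls each documented method. The require path `lpeg_patterns.IPv4` is an assumption.

```lua
local lpeg = require "lpeg"
local IPv4 = require "lpeg_patterns.IPv4" -- assumed module name

local ip = (IPv4.IPv4address * lpeg.P(-1)):match("192.0.2.1")
assert(ip, "not a strict dotted decimal IPv4 address")

local a, b, c, d = ip:unpack() -- four 8-bit numbers
local packed = ip:binary()     -- 4 byte binary string

print(a, b, c, d)
print(#packed)
print(tostring(ip))            -- dotted decimal notation via __tostring
```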

IPv6

  • IPv6address (pattern): parses an IPv6 address
  • IPv6addrz (pattern): parses an IPv6 address with optional "ZoneID" (see RFC-6874)
  • IPv6_methods (table): methods available on IPv6 objects
    • unpack (function): the IPv6 address as a series of eight 16-bit numbers, optionally followed by the zoneid
    • binary (function): the IPv6 address as a 16 byte binary string
    • setzoneid (function): set the zoneid of this IPv6 address
  • IPv6_mt (table): metatable given to IPv6 objects
    • __tostring (function): will return the IPv6 address as a valid IPv6 string
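IPv6 objects can be used in much the same way. This sketch assumes the module loads as `lpeg_patterns.IPv6` and parses an address without a zoneid; the exact string produced by `tostring` is not specified in this document, so it is printed rather than checked.

```lua
local lpeg = require "lpeg"
local IPv6 = require "lpeg_patterns.IPv6" -- assumed module name

local ip = (IPv6.IPv6address * lpeg.P(-1)):match("2001:db8::1")
assert(ip, "not an IPv6 address")

local groups = { ip:unpack() } -- eight 16-bit numbers (no zoneid here)
local packed = ip:binary()     -- 16 byte binary string

print(groups[1], groups[2], groups[8])
print(#packed)
print(tostring(ip))            -- some valid IPv6 string form
```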

uri

Parses URIs as described in RFC-3986.

  • uri (pattern): on success, returns a table with the following fields (similar to luasocket):
    • scheme
    • userinfo
    • host
    • port
    • path
    • query
    • fragment
  • absolute_uri (pattern): similar to uri, but does not permit fragments
  • uri_reference (pattern): similar to uri, but permits relative URIs
  • relative_part (pattern): matches a relative uri not including query and fragment; data is held in named group captures "userinfo", "host", "port", "path"
  • scheme (pattern): matches the scheme portion of a URI
  • userinfo (pattern): matches the userinfo portion of a URI
  • host (pattern): matches the host portion of a URI
  • IP_literal (pattern): matches an IP based host portion of a URI. Capture is an IPv4, IPv6 or IPvFuture object
  • port (pattern): matches the port portion of a URI
  • authority (pattern): matches the authority portion of a URI; data is held in named group captures of "userinfo", "host", "port"
  • path (pattern): matches the path portion of a URI. Captures nil for the empty path.
  • segment (pattern): matches a path segment (a piece of a path without a /)
  • query (pattern): matches the query portion of a URI
  • fragment (pattern): matches the fragment portion of a URI
  • sane_uri (pattern): a variant that shouldn't match things that people would not normally consider URIs, e.g. URIs without a hostname
  • sane_host (pattern): a variant that shouldn't match things that people would not normally consider valid hosts.
  • sane_authority (pattern): a variant that shouldn't match things that people would not normally consider valid hosts.
  • pct_encoded (pattern): matches a percent encoded octet, produces a capture of the normalised form.
  • sub_delims (pattern): the set of subcomponent delimiters
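To show how the table returned by `uri` might be consumed, here is a minimal sketch. It assumes the module loads as `lpeg_patterns.uri`; the example URI is arbitrary, and the type of the `port` field is not stated here, so it is only printed.

```lua
local lpeg = require "lpeg"
local uri_lib = require "lpeg_patterns.uri" -- assumed module name

local EOF = lpeg.P(-1)
local parts = (uri_lib.uri * EOF):match("http://user@example.com:8080/a/b?x=1#top")
assert(parts, "not a URI")

for _, field in ipairs { "scheme", "userinfo", "host", "port", "path", "query", "fragment" } do
    print(field, parts[field])
end

-- absolute_uri does not permit a fragment, so an anchored match fails here.
local rejected = (uri_lib.absolute_uri * EOF):match("http://example.com/#top")
print(rejected)
```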

email

  • mailbox (pattern): the mailbox format: matches either name_addr or an addr-spec.
  • name_addr (pattern): the name and address format, i.e. Display Name<email@example.com>. Has captures of the local_part and the domain. Captures the display name in the named capture "display".
  • email (pattern): also known as an "addr-spec"; follows RFC-5322 section 3.4.1. Has captures of the local_part and the domain. Be careful when reconstructing the email address from the captures; you may need escaping.
  • local_part (pattern): the bit before the @ in an email address
  • domain (pattern): the bit after the @ in an email address
  • email_nocfws (pattern): a variant that doesn't allow for comments or folding whitespace
  • local_part_nocfws (pattern): the bit before the @ in an email address; no comments or folding whitespace allowed.
  • domain_nocfws (pattern): the bit after the @ in an email address; no comments or folding whitespace allowed.
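A short sketch of reading the two captures produced by `email`, assuming the module loads as `lpeg_patterns.email`. As noted above, joining the captures back together with "@" is not guaranteed to reproduce a valid address without escaping.

```lua
local lpeg = require "lpeg"
local email_lib = require "lpeg_patterns.email" -- assumed module name

local strict_email = email_lib.email * lpeg.P(-1)

-- On success the captures are the local_part and the domain.
local local_part, domain = strict_email:match("user@example.com")
print(local_part, domain)

-- On failure match returns nil.
print(strict_email:match("not an address"))
```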

http

The APIs of these patterns should be considered unstable.

  • DAV (pattern)
  • Depth (pattern)
  • Destination (pattern)
  • If (pattern)
  • Lock_Token (pattern)
  • Overwrite (pattern)
  • TimeOut (pattern)
  • SLUG (pattern)
  • DASL (pattern)
  • Accept_Patch (pattern)
  • Link (pattern)
  • Set_Cookie (pattern)
  • Cookie (pattern)
  • Content_Disposition (pattern)
  • Origin (pattern)
  • Sec_WebSocket_Accept (pattern)
  • Sec_WebSocket_Key (pattern)
  • Sec_WebSocket_Extensions (pattern)
  • Sec_WebSocket_Protocol_Client (pattern)
  • Sec_WebSocket_Protocol_Server (pattern)
  • Sec_WebSocket_Version_Client (pattern)
  • Sec_WebSocket_Version_Server (pattern)
  • Schedule_Reply (pattern)
  • Schedule_Tag (pattern)
  • If_Schedule_Tag_Match (pattern)
  • Strict_Transport_Security (pattern)
  • X_Frame_Options (pattern)
  • Accept_Datetime (pattern)
  • Memento_Datetime (pattern)
  • request_line (pattern)
  • field_name (pattern)
  • field_value (pattern)
  • header_field (pattern)
  • OWS (pattern)
  • RWS (pattern)
  • BWS (pattern)
  • token (pattern)
  • qdtext (pattern)
  • quoted_string (pattern)
  • comment (pattern)
  • Content_Length (pattern)
  • Transfer_Encoding (pattern)
  • chunk_ext (pattern)
  • TE (pattern)
  • Trailer (pattern)
  • request_target (pattern)
  • Host (pattern)
  • Via (pattern): captures are a list of tables with fields .protocol, .by and .comment
  • Connection (pattern)
  • Upgrade (pattern): captures are a list of strings containing protocol or protocol/version
  • IMF_fixdate (pattern)
  • Content_Encoding (pattern)
  • Content_Type (pattern)
  • Content_Language (pattern)
  • Content_Location (pattern)
  • Expect (pattern)
  • Max_Forwards (pattern)
  • Accept (pattern)
  • Accept_Charset (pattern)
  • Accept_Encoding (pattern)
  • Accept_Language (pattern)
  • From (pattern)
  • Referer (pattern)
  • User_Agent (pattern)
  • Date (pattern): capture is a table in the same format as used by os.time
  • Location (pattern)
  • Retry_After (pattern): capture is either a table describing an absolute time in the same format as used by os.time, or a relative time as a number of seconds
  • Vary (pattern)
  • Allow (pattern)
  • Server (pattern)
  • Last_Modified (pattern): capture is a table in the same format as used by os.time
  • ETag (pattern)
  • If_Match (pattern)
  • If_None_Match (pattern)
  • If_Modified_Since (pattern): capture is a table in the same format as used by os.time
  • If_Unmodified_Since (pattern): capture is a table in the same format as used by os.time
  • Accept_Ranges (pattern)
  • Range (pattern)
  • If_Range (pattern): capture is either an entity_tag or a table in the same format as used by os.time
  • Content_Range (pattern)
  • Age (pattern)
  • Cache_Control (pattern): captures are grouped into key/value pairs (where a directive with no value has a value of true)
  • Expires (pattern): capture is a table in the same format as used by os.time
  • Pragma (pattern)
  • Warning (pattern)
  • WWW_Authenticate (pattern)
  • Authorization (pattern)
  • Proxy_Authenticate (pattern)
  • Proxy_Authorization (pattern)
  • Forwarded (pattern)
  • Public_Key_Pins (pattern)
  • Public_Key_Pins_Report_Only (pattern)
  • Hobareg (pattern)
  • Authentication_Info (pattern)
  • Proxy_Authentication_Info (pattern)
  • ALPN (pattern)
  • CalDAV_Timezones (pattern)
  • Alt_Svc (pattern)
  • Alt_Used (pattern)
  • Expect_CT (pattern)
  • Referrer_Policy (pattern)

phone

  • phone (pattern): includes detailed checking for:
    • USA phone numbers using the NANP

language

Patterns for definitions from RFC-4646 Section 2.1

  • langtag (pattern): Capture is a table with the language tag decomposed into components:
    • language
    • extlang (optional)
    • script (optional)
    • region (optional)
    • variant (optional): an array
    • extension (optional): a dictionary from singleton to value
    • privateuse (optional): an array
  • privateuse (pattern): captures an array
  • Language_Tag (pattern): captures the whole language tag
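The decomposed table from `langtag` can be read field by field. This is a sketch that assumes the module loads as `lpeg_patterns.language`; whether the captured components are case-normalised is not stated here, so the values are printed as returned.

```lua
local lpeg = require "lpeg"
local language = require "lpeg_patterns.language" -- assumed module name

local tag = (language.langtag * lpeg.P(-1)):match("zh-Hant-TW")
assert(tag, "not a language tag")

-- Optional components that are absent from the tag are simply nil.
print(tag.language, tag.script, tag.region, tag.extlang)
```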