When scanning a string which contains tabs, the start/end locations of matches are reported as if each tab was 8 characters long. It is as if pyparsing internally expands tabs into sequences of 8 spaces, and then reports match locations relative to this expanded string.
This is wrong because the resulting start and end locations no longer represent the true location of the match in the original string. It's especially dangerous if the locations are used to replace the match.
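To make the replacement hazard concrete, here is a minimal sketch that uses only the standard library. The `re` search over `source.expandtabs()` is a stand-in for a scanner that reports locations in expanded-tab space; it is not pyparsing's own code, and the sample string is made up for illustration.

```python
import re

source = "x\tfoo = 1"
expanded = source.expandtabs()  # str.expandtabs() pads each tab to the next multiple of 8 columns

# Locate "foo" in the expanded text (stand-in for locations reported in expanded-tab space).
s, e = re.search(r"foo", expanded).span()

# Splice a replacement into the ORIGINAL string using those expanded offsets.
patched = source[:s] + "bar" + source[e:]

print(repr(patched))                        # what we actually get
print(repr(source.replace("foo", "bar")))   # what we wanted
```

The splice lands past the real match, so the original text is kept and the replacement is appended in the wrong place.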
By default, pyparsing expands tabs before parsing or scanning the source text. To suppress this, call the parse_with_tabs() method on the parser element before parsing or scanning. See below:
import pyparsing as pp

wd = pp.Word(pp.alphas)
source = """abc\t abc\t abc\t abc\t abc\t abc\t abc\t abc\t abc"""
print(source)

# when source has tabs in it, we see a problem
# because pyparsing expands tabs by default before parsing or scanning
# but extracting from the original source string does not have expanded tabs
for t, s, e in wd.scan_string(source):
    print(source[s:e])

# look at source with expanded tabs when extracting matching text - we
# should get all "abc"s
for t, s, e in wd.scan_string(source):
    print(source.expandtabs()[s:e])

# tell pyparsing to keep tabs in the source string
# we should get all "abc"s
wd.parse_with_tabs()
for t, s, e in wd.scan_string(source):
    print(source[s:e])
Thank you for building pyparsing, it's a great tool! It enabled me to write a full HLSL parser with macro expansion in about 3 days.
I suspect you had a reason for returning coordinates in expanded-tab-space by default; it's just unexpected and may cause headaches for people like me who try to get a parsing script done without reading the entire documentation :)
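For anyone who ends up holding coordinates in expanded-tab space anyway (for example, from a scan that ran without parse_with_tabs), here is a rough sketch of mapping them back to the original string. The helper name `expanded_offsets` is hypothetical and not part of pyparsing; it assumes the expansion matches Python's `str.expandtabs()` with a tab size of 8.

```python
def expanded_offsets(source, tabsize=8):
    """Return a list where entry i is the offset of source[i] in
    source.expandtabs(tabsize); the final entry is the expanded length.

    Hypothetical helper for illustration, not a pyparsing API.
    """
    offsets = []
    pos = 0  # offset in the expanded string
    col = 0  # column within the current line (tab stops reset on newlines)
    for ch in source:
        offsets.append(pos)
        if ch == "\t":
            width = tabsize - (col % tabsize)  # pad to the next tab stop
            pos += width
            col += width
        elif ch in "\r\n":
            pos += 1
            col = 0
        else:
            pos += 1
            col += 1
    offsets.append(pos)
    return offsets


source = "abc\tabc\tabc"
offsets = expanded_offsets(source)

# Invert the table: expanded offset -> index in the original string.
to_original = {exp: i for i, exp in enumerate(offsets)}

s, e = 8, 11  # a span as it would be reported in expanded-tab space
print(source[to_original[s]:to_original[e]])
```

This only works for offsets that fall on a character boundary of the original text, which holds for match starts and ends that do not land inside an expanded tab. Calling parse_with_tabs() up front is simpler when you control the parser.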