This repository was archived by the owner on Feb 19, 2021. It is now read-only.

@heinrich5991

The regex was broken before, using `(?!…)` instead of `(?<=…)`.
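The difference between the two constructs can be illustrated with a small sketch. The patterns below are hypothetical stand-ins, not the regex from this pull request: `(?!…)` is a negative lookahead that only inspects what comes next, whereas `(?<=…)` is a positive lookbehind that requires a given prefix before the match.

```python
import re

# Hypothetical patterns for illustration only; not the project's actual regex.
# Negative lookahead: "the next character is not '_' or '-'".
# That is trivially true whenever a digit follows, so it constrains nothing.
lookahead = re.compile(r'(?![_-])[0-9]{4}')

# Positive lookbehind: "the previous character is '_' or '-'".
lookbehind = re.compile(r'(?<=[_-])[0-9]{4}')

# No separator before the digits: only the lookahead variant matches.
no_sep_ahead = lookahead.search('x2019')
no_sep_behind = lookbehind.search('x2019')

# Separator before the digits: the lookbehind variant matches as intended.
with_sep_behind = lookbehind.search('x_2019')
```

With the lookahead variant, `x2019` still yields `2019`, which is probably not what a "must be preceded by a separator" rule is meant to allow.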
skius previously approved these changes Aug 19, 2019
@MasterofJOKers
Contributor

Why do we need lookbehind and lookahead at all? Can't we get away with something like this?

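import re

# Consume the delimiter (word boundary, '_' or '-') on both sides instead of
# using lookbehind/lookahead; the date itself is captured in group 1.
r = re.compile(r'(?:\b|[_-])('\
               r'(?:[0-9]{1,2}[./-][0-9]{1,2}[./-](?:[0-9]{4}|[0-9]{2}))|'\
               r'(?:(?:[0-9]{4}|[0-9]{2})[./-][0-9]{1,2}[./-][0-9]{1,2})|'\
               r'(?:[0-9]{1,2}\. +[^\W\d_]{3,9} (?:[0-9]{4}|[0-9]{2}))|'\
               r'(?:[^\W\d_]{3,9}(?: [0-9]{1,2},)? [0-9]{4})'\
               r')(?:\b|[_-])')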

In some manual testing, it seems to match everything matched in the unit tests. We can then use m.group(1) for the date-part of the matched string.
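As a rough sketch of how `m.group(1)` would be used with the pattern above: the helper name `extract_date` and the sample strings are made up for illustration and are not taken from the project's unit tests.

```python
import re

# Same pattern as suggested above: delimiters are consumed outside group 1.
DATE_RE = re.compile(
    r'(?:\b|[_-])('
    r'(?:[0-9]{1,2}[./-][0-9]{1,2}[./-](?:[0-9]{4}|[0-9]{2}))|'
    r'(?:(?:[0-9]{4}|[0-9]{2})[./-][0-9]{1,2}[./-][0-9]{1,2})|'
    r'(?:[0-9]{1,2}\. +[^\W\d_]{3,9} (?:[0-9]{4}|[0-9]{2}))|'
    r'(?:[^\W\d_]{3,9}(?: [0-9]{1,2},)? [0-9]{4})'
    r')(?:\b|[_-])'
)


def extract_date(text):
    """Hypothetical helper: return the first date-like substring, or None."""
    m = DATE_RE.search(text)
    # group(0) may include a leading or trailing '_' or '-';
    # group(1) is only the date part.
    return m.group(1) if m else None


underscore_case = extract_date('scan_2019-08-19_invoice.pdf')
dotted_case = extract_date('Invoice from 19.08.2019')
day_month_name_case = extract_date('12. August 2019')
month_name_first_case = extract_date('August 19, 2019')
no_date_case = extract_date('version 1.2')
```

The first sample is the interesting one: `\b` does not match between `n` and `_` because both are word characters, so the `[_-]` alternative is what lets the date be found there.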

👍 for the additional tests.

@heinrich5991
Author

Updated with the suggestion to not use lookahead/lookbehind.

@heinrich5991
Author

Removed all the superfluous (?:).

@MasterofJOKers
Contributor

> Removed all the superfluous (?:).

Great, could you also remove the superfluous \ in the [], while you're at it?
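For context, a small sketch of why such backslashes are superfluous in Python's `re`: inside a character class, `.` and `/` are already literal, and a `-` placed last is literal too. The two classes below are illustrative and not copied from the pull request's diff.

```python
import re

# Illustrative character classes; the PR's actual regex is not reproduced here.
escaped = re.compile(r'[\.\/-]')  # backslashes that add nothing
plain = re.compile(r'[./-]')      # equivalent, easier to read

samples = ['.', '/', '-', '_', 'a', '1', '\\']

# Collect the sample characters each class accepts.
escaped_hits = [c for c in samples if escaped.fullmatch(c)]
plain_hits = [c for c in samples if plain.fullmatch(c)]
```

Both classes accept exactly the same three separator characters, so dropping the escapes does not change behavior.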

MasterofJOKers previously approved these changes Nov 2, 2019
@heinrich5991
Author

Removed the superfluous backslashes in the regex.
