@tgrandje
Collaborator

Fix #286

Could anybody take a second look at the separator behaviour described in the documentation and at my implementation of it?

In particular, I'm unsure about the espace ponctuation+blanc* rule, which I read as a "whitespace separator" plus "[punctuation] followed by any number of whitespace characters" (a kind of regex pattern, given the * character). That doesn't make much sense to me: shouldn't it reduce to "whitespace OR punctuation"?

Note: I extracted the HTML table and parsed it, so there shouldn't be any typos in the SEPARATOR constant (I only dropped the espace ponctuation+blanc* rule).

Collaborator

@tfardet tfardet left a comment

Thanks for the hard work. I have a different understanding of the "ponctuation+blanc*" rule: judging from their example, punctuation on its own is not a separator (the * symbol in the name may be misleading for regex users), so "a.b." should stay "a.b" as a single word.
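
For illustration only, here is a minimal sketch of that reading of the rule. It is not the code merged in this PR: `split_words` and `_SEPARATOR` are hypothetical names, and treating punctuation at the very end of the input like punctuation followed by a blank is an assumption inferred from the "a.b." example.

```python
import re
import string

# Assumption: "punctuation" means ASCII punctuation; the real rule may use
# a different character set.
_PUNCT = re.escape(string.punctuation)

# Hypothetical separator: a run of punctuation followed by whitespace (or by
# the end of the input), or plain whitespace. Punctuation directly followed
# by another word character does not split.
_SEPARATOR = re.compile(rf"[{_PUNCT}]+(?:\s+|$)|\s+")


def split_words(text: str) -> list[str]:
    """Split text into words under the reading described above."""
    return [token for token in _SEPARATOR.split(text) if token]


print(split_words("a.b."))      # ['a.b']
print(split_words("a. b"))      # ['a', 'b']
print(split_words("foo, bar"))  # ['foo', 'bar']
```

Under the "whitespace OR punctuation" reading, "a.b." would instead split into "a" and "b", which is the behaviour the documentation's example rules out.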

tgrandje and others added 2 commits December 15, 2025 09:06
Co-authored-by: tfardet <79037344+tfardet@users.noreply.github.com>
Co-authored-by: tfardet <79037344+tfardet@users.noreply.github.com>
@tgrandje
Collaborator Author

You're right, I missed the footnote... That's quite an unfortunate way to use a star in that context!

@tfardet
Collaborator

tfardet commented Dec 15, 2025

No worries, thanks for taking the time to fix that!

@tfardet tfardet merged commit 9ca80e1 into master Dec 15, 2025
2 checks passed
Development

Successfully merging this pull request may close these issues.

search_sirene returns HTTP 400 when denominationUniteLegale contains '&' character

3 participants