gh-152845: Keep EFS flag for a file loaded from the archive #152846

Open
danny0838 wants to merge 4 commits into python:main from danny0838:gh-152845

@danny0838 danny0838 commented Jul 2, 2026

Contributor

Introduce an internal `_metadata_encoding` attribute for `ZipInfo` so that files read from an archive keep their original encoding and EFS flag, while any newly added file sets the EFS flag when it has a non-ASCII filename or comment.
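
For readers unfamiliar with the EFS flag: it is general purpose bit 11 (`0x800`) of a ZIP entry, marking the filename and comment as UTF-8. The sketch below is only an illustration using the public `zipfile` API, not code from this PR; the `efs_flags` helper and the `EFS_FLAG` constant are hypothetical names. It writes a few entries to an in-memory archive and reports which ones carry the flag after being read back.

```python
import io
import zipfile

EFS_FLAG = 0x800  # general purpose bit 11: filename and comment are UTF-8


def efs_flags(names):
    """Write each name to an in-memory archive and report its EFS bit."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"data")

    # Reopen the archive so the flags come from the central directory.
    buf.seek(0)
    with zipfile.ZipFile(buf, "r") as zf:
        return {info.filename: bool(info.flag_bits & EFS_FLAG)
                for info in zf.infolist()}


if __name__ == "__main__":
    for name, has_efs in efs_flags(["plain.txt", "héllo.txt"]).items():
        print(f"{name!r}: EFS set = {has_efs}")
```

The same `flag_bits & 0x800` check can be applied to entries of an existing archive to see whether a rewrite in append mode changed the flag.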

@serhiy-storchaka serhiy-storchaka self-requested a review July 2, 2026 10:05
@serhiy-storchaka

Member

I tried to avoid adding new attributes. Let's see how we can solve this.

@danny0838

Contributor Author

I agree that we should be careful about adding new attributes. However, it seems to be the most elegant way to handle the "preserve the original encoding only for files read from the archive" issue. It also works when someone replicates a file by copying the ZipInfo object.

danny0838 and others added 3 commits July 2, 2026 18:58
Fix a regression introduced by pythongh-84353/pythongh-150091 where the EFS flag
was dropped or omitted when a file with an ASCII filename and a UTF-8
comment was written to an archive. This affected both newly added files
and existing files rewritten to the central directory in append mode,
causing an unexpected metadata change and leading to comment
mis-decoding.

Introduce an internal `_metadata_encoding` attribute for `ZipInfo` to
ensure that files read from an archive preserve their original encoding
and EFS flags, while newly added files now properly enforce EFS if they
contain a non-ASCII filename or comment.

Allow the `metadata_encoding` parameter in all modes, enabling proper
decoding with a customized codec in 'a' mode. This parameter is
ignored for 'w' and 'x' modes.

@serhiy-storchaka

Member

@danny0838, please never amend and force-push, at least once a review has started. It forces reviewers to restart the review from the beginning instead of just looking at the new changes.
