Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: percent encode dir_name before handing it over to FS #3489

Closed
FirefoxMetzger opened this issue Jan 28, 2023 · 2 comments · Fixed by #4017
Closed

bug: percent encode dir_name before handing it over to FS #3489

FirefoxMetzger opened this issue Jan 28, 2023 · 2 comments · Fixed by #4017
Assignees
Labels
bug Something isn't working

Comments

@FirefoxMetzger
Copy link

Describe the bug

FS uses three special characters (:, @, and !) to infer things about the path. This is perfectly fine when working with remote filesystems; however, locally this can cause issues because @ and ! are allowed on all major OSes (linux also allows :, but neither windows nor mac do).

As a result, FS gets confused when passing a local path that contains such characters (see PyFilesystem/pyfilesystem2#562 for details). Ideally, FS would deal with this automatically; however, an alternative (and perhaps more compatible) way of handling this would be for us to pass percent-encoded strings to FS. Instead of writing the following:

def copy_file_to_fs_folder(
src_path: str,
dst_fs: FS,
dst_folder_path: str = ".",
dst_filename: t.Optional[str] = None,
):
"""Copy the given file at src_path to dst_fs filesystem, under its dst_folder_path
folder with dst_filename as file name. When dst_filename is None, keep the original
file name.
"""
src_path = os.path.realpath(os.path.expanduser(src_path))
dir_name, file_name = os.path.split(src_path)
src_fs = fs.open_fs(dir_name)
dst_filename = file_name if dst_filename is None else dst_filename
dst_path = fs.path.join(dst_folder_path, dst_filename)
dst_fs.makedir(dst_folder_path, recreate=True)
fs.copy.copy_file(src_fs, file_name, dst_fs, dst_path)

we would call urllib.parse.quote just before (ofc with the import moved to an appropriate location):

def copy_file_to_fs_folder(
    src_path: str,
    dst_fs: FS,
    dst_folder_path: str = ".",
    dst_filename: t.Optional[str] = None,
):
    """Copy the given file at src_path to dst_fs filesystem, under its dst_folder_path
    folder with dst_filename as file name. When dst_filename is None, keep the original
    file name.
    """
    from urllib.parse import quote

    src_path = os.path.realpath(os.path.expanduser(src_path))
    dir_name, file_name = os.path.split(src_path)
    src_fs = fs.open_fs(quote(dir_name))
    dst_filename = file_name if dst_filename is None else dst_filename
    dst_path = fs.path.join(dst_folder_path, dst_filename)
    dst_fs.makedir(dst_folder_path, recreate=True)
    fs.copy.copy_file(src_fs, file_name, dst_fs, dst_path)

To reproduce

No response

Expected behavior

No response

Environment

Environment variable

BENTOML_DEBUG=''
BENTOML_QUIET=''
BENTOML_BUNDLE_LOCAL_BUILD=''
BENTOML_DO_NOT_TRACK=''
BENTOML_CONFIG=''
BENTOML_CONFIG_OPTIONS=''
BENTOML_PORT=''
BENTOML_HOST=''
BENTOML_API_WORKERS=''

System information

bentoml: 1.0.13
python: 3.8.8
platform: Windows-10-10.0.22000-SP0
is_window_admin: False

pip_packages
aiohttp==3.7.4.post0
anyio==3.6.2
appdirs==1.4.4
asgiref==3.6.0
async-timeout==3.0.1
attrs==22.2.0
backoff==2.2.1
bentoml==1.0.13
build==0.10.0
cattrs==22.2.0
certifi==2022.12.7
chardet==4.0.0
charset-normalizer==3.0.1
circus==0.18.0
click==8.1.3
click-option-group==0.5.5
cloudpickle==2.2.1
colorama==0.4.6
contextlib2==21.6.0
deepmerge==1.1.0
Deprecated==1.2.13
exceptiongroup==1.1.0
fs==2.4.16
googleapis-common-protos==1.58.0
h11==0.14.0
idna==3.4
Jinja2==3.1.2
joblib==1.2.0
markdown-it-py==2.1.0
MarkupSafe==2.1.2
mdurl==0.1.2
multidict==6.0.4
numpy==1.24.1
opentelemetry-api==1.14.0
opentelemetry-exporter-otlp-proto-http==1.14.0
opentelemetry-instrumentation==0.35b0
opentelemetry-instrumentation-aiohttp-client==0.35b0
opentelemetry-instrumentation-asgi==0.35b0
opentelemetry-proto==1.14.0
opentelemetry-sdk==1.14.0
opentelemetry-semantic-conventions==0.35b0
opentelemetry-util-http==0.35b0
packaging==21.3
pandas==1.5.3
pathspec==0.11.0
pip-requirements-parser==32.0.1
pip-tools==6.12.1
prometheus-client==0.16.0
protobuf==3.20.3
psutil==5.9.4
Pygments==2.14.0
pynvml==11.4.1
pyparsing==3.0.9
pyproject_hooks==1.0.0
python-dateutil==2.8.2
python-json-logger==2.0.4
python-multipart==0.0.5
pytz==2022.7.1
PyYAML==6.0
pyzmq==25.0.0
requests==2.28.2
rich==13.3.0
schema==0.7.5
scikit-learn==1.2.1
scipy==1.10.0
simple-di==0.1.5
six==1.16.0
sniffio==1.3.0
starlette==0.23.1
threadpoolctl==3.1.0
tomli==2.0.1
tornado==6.2
typing_extensions==4.4.0
urllib3==1.26.14
uvicorn==0.20.0
watchfiles==0.18.1
wrapt==1.14.1
yarl==1.8.2
@FirefoxMetzger FirefoxMetzger added the bug Something isn't working label Jan 28, 2023
@sauyon
Copy link
Contributor

sauyon commented Jan 29, 2023

Hm, looking at the comments on the FS issue seems like we should attempt to detect when it's a filesystem path and use OSFS, we should do that.

@frostming
Copy link
Contributor

It is more like a bug of pyfilesystem2, but it doesn't have updates for 9 months. We might replace it in the future, but now we can workaround it in BentoML.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants