Filesystem API
The HfFileSystem class provides a Pythonic file interface to the Hugging Face Hub, built on fsspec.
HfFileSystem
HfFileSystem is based on fsspec, so it is compatible with most of the APIs that fsspec offers. For more details, see the guide and the fsspec API reference.
class huggingface_hub.HfFileSystem
( *args, **kwargs )
Parameters
- token (str or bool, optional): A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co./docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
Access a remote Hugging Face Hub repository as if it were a local file system.
Usage:
>>> from huggingface_hub import HfFileSystem
>>> fs = HfFileSystem()
>>> # List files
>>> fs.glob("my-username/my-model/*.bin")
['my-username/my-model/pytorch_model.bin']
>>> fs.ls("datasets/my-username/my-dataset", detail=False)
['datasets/my-username/my-dataset/.gitattributes', 'datasets/my-username/my-dataset/README.md', 'datasets/my-username/my-dataset/data.json']
>>> # Read/write files
>>> with fs.open("my-username/my-model/pytorch_model.bin") as f:
...     data = f.read()
>>> with fs.open("my-username/my-model/pytorch_model.bin", "wb") as f:
...     f.write(data)
__init__
( *args, endpoint: Optional = None, token: Union = None, **storage_options )
Parameters
- use_listings_cache, listings_expiry_time, max_paths: passed to DirCache, if the implementation supports directory listing caching. Pass use_listings_cache=False to disable such caching.
- skip_instance_cache (bool): If this is a cachable implementation, pass True here to force creating a new instance even if a matching instance exists, and prevent storing this instance.
- asynchronous (bool)
- loop (asyncio-compatible IOLoop or None)
Docstring taken from fsspec documentation.
Create and configure file-system instance
Instances may be cachable, so if similar enough arguments are seen a new instance is not required. The token attribute exists to allow implementations to cache instances if they wish.
A reasonable default should be provided if there are no arguments.
Subclasses should call this method.
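The instance caching described above can be illustrated with a small standard-library sketch. This is not fsspec's actual implementation: the metaclass, the ToyFileSystem class, the cache_token attribute, and the example endpoint are all hypothetical names chosen for illustration. It only shows the idea that matching constructor arguments return the same object unless skip_instance_cache=True is passed.

```python
import hashlib


class CachedInstanceMeta(type):
    """Hypothetical metaclass: reuse an instance when constructor arguments match."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls._instances = {}  # one cache per class

    def __call__(cls, *args, skip_instance_cache=False, **kwargs):
        # Derive a stable token from the class name and the arguments.
        raw = repr((cls.__name__, args, sorted(kwargs.items())))
        cache_token = hashlib.sha256(raw.encode()).hexdigest()
        if not skip_instance_cache and cache_token in cls._instances:
            return cls._instances[cache_token]
        instance = super().__call__(*args, **kwargs)
        instance.cache_token = cache_token
        if not skip_instance_cache:
            # Only store instances that were not asked to bypass the cache.
            cls._instances[cache_token] = instance
        return instance


class ToyFileSystem(metaclass=CachedInstanceMeta):
    """Stand-in for a file-system class; not part of any real library."""

    def __init__(self, endpoint=None):
        self.endpoint = endpoint


a = ToyFileSystem(endpoint="https://example.invalid")
b = ToyFileSystem(endpoint="https://example.invalid")
c = ToyFileSystem(endpoint="https://example.invalid", skip_instance_cache=True)
```

Here a and b are the same object because their arguments match, while c is a fresh instance that is never stored in the cache.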
ls
( path: str, detail: bool = True, refresh: bool = False, revision: Optional = None, **kwargs )
Docstring taken from fsspec documentation.
List objects at path.
This should include subdirectories and files at that location. The difference between a file and a directory must be clear when details are requested.
The specific keys, or perhaps a FileInfo class, or similar, is TBD, but must be consistent across implementations. Must include:
- full path to the entry (without protocol)
- size of the entry, in bytes. If the value cannot be determined, will be None.
- type of entry, “file”, “directory” or other
Additional information may be present, appropriate to the file-system, e.g., generation, checksum, etc.
May use refresh=True|False to allow use of self._ls_from_cache to check for a saved listing and avoid calling the backend. This would be common where listing may be expensive.
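As a rough sketch of how a detailed listing might be consumed, the snippet below works on hand-written entries rather than a real call to the Hub. It assumes the fsspec convention of "name", "size" and "type" keys for the three required fields; the sample paths, the sizes, and both helper functions are hypothetical. One entry has a size of None to exercise the case where the size cannot be determined.

```python
from typing import Dict, List, Tuple

# Hypothetical sample shaped like the entries described above.
sample_listing: List[Dict] = [
    {"name": "datasets/my-username/my-dataset/README.md", "size": 512, "type": "file"},
    {"name": "datasets/my-username/my-dataset/data.json", "size": 2048, "type": "file"},
    {"name": "datasets/my-username/my-dataset/images", "size": None, "type": "directory"},
]


def split_entries(entries: List[Dict]) -> Tuple[List[str], List[str]]:
    """Separate file paths from directory paths using the "type" field."""
    files = [e["name"] for e in entries if e["type"] == "file"]
    directories = [e["name"] for e in entries if e["type"] == "directory"]
    return files, directories


def total_known_size(entries: List[Dict]) -> int:
    """Sum the sizes in bytes, skipping entries whose size is None."""
    return sum(e["size"] for e in entries if e.get("size") is not None)


files, directories = split_entries(sample_listing)
total_bytes = total_known_size(sample_listing)
```

With a real file system, the entries would come from a call such as fs.ls(path, detail=True), and passing refresh=True would request a fresh listing instead of a cached one.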