Parquet Content-Defined Chunking
Community
Does the Hugging Face Xet Storage also work on top of (self-hosted) Minio?
@sfkeller Not today, but the underlying technology is open source and we're in the process of documenting the backend! We plan to release a Xet protocol later this year, which would open up the possibility to build for other backends as well. cc @rajatarya
This is great! I will look forward to the release of Xet protocol.
I hope it is written in Rust :)
The protocol and format will be documented, but there is already the hf-xet implementation in Rust with xet-core.