title: README
emoji: π
colorFrom: gray
colorTo: yellow
sdk: static
pinned: false
license: other
ivrit.ai is an effort to provide high-quality Hebrew datasets under a permissive license.
It is our hope that such datasets will be used to enable first-class support for Hebrew in AI models.
More about us can be found at ivrit.ai.
We are proud to present our latest achievements:
1) A state-of-the-art Hebrew speech-to-text model: https://huggingface.co/ivrit-ai/faster-whisper-v2-d4
2) Our newest comprehensive Hebrew language dataset: https://huggingface.co/datasets/ivrit-ai/crowd-transcribe-v5
Paper: https://arxiv.org/abs/2307.08720
If you use our datasets, please cite the following:
```
@misc{marmor2023ivritai,
  title={ivrit.ai: A Comprehensive Dataset of Hebrew Speech for AI Research and Development},
  author={Yanir Marmor and Kinneret Misgav and Yair Lifshitz},
  year={2023},
  eprint={2307.08720},
  archivePrefix={arXiv},
  primaryClass={eess.AS}
}
```