llm-perf-leaderboard / src /assets /text_content.py


	TITLE = """<h1 align="center" id="space-title">🤗 Open LLM-Perf Leaderboard 🏋️</h1>"""

	INTRODUCTION_TEXT = """
	The 🤗 Open LLM-Perf Leaderboard 🏋️ benchmarks the performance (latency and throughput) of Large Language Models (LLMs) across different hardware, backends, and optimizations using [Optimum-Benchmark](https://github.com/huggingface/optimum-benchmark) and [Optimum](https://github.com/huggingface/optimum) flavors.

	Anyone from the community can request a model or a hardware/backend/optimization configuration for automated benchmarking:
	- Model evaluation requests should be made in the [🤗 Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) and will be added to the 🤗 Open LLM-Perf Leaderboard 🏋️ automatically.
	- Hardware/Backend/Optimization performance requests should be made in the [community discussions](https://huggingface.co/spaces/optimum/llm-perf-leaderboard/discussions) to assess their relevance and feasibility.
	"""

	A100_TEXT = """<h3>Single-GPU Benchmark (1xA100):</h3>
	<ul>
	<li>Batch size: 1</li>
	<li>Generated tokens: 1000</li>
	</ul>
	"""
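
# Illustrative sketch (not part of the leaderboard code): under the configuration
# above (batch size 1, 1000 generated tokens), a throughput figure in tokens per
# second could be derived from a measured generation latency as follows. The
# function name and its parameters are hypothetical; how Optimum-Benchmark
# actually times generation is not shown in this file.

```python
def tokens_per_second(new_tokens: int, batch_size: int, latency_s: float) -> float:
    """Return generation throughput in tokens per second.

    Assumes `latency_s` is the wall-clock time, in seconds, taken to generate
    `new_tokens` tokens for each of the `batch_size` sequences.
    """
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    return batch_size * new_tokens / latency_s


# Example: 1000 tokens at batch size 1 generated in 40 seconds.
throughput = tokens_per_second(new_tokens=1000, batch_size=1, latency_s=40.0)
```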

	ABOUT_TEXT = """<h3>About the benchmarks:</h3>
	<ul>
	<li>The performance benchmarks were obtained using <a href="https://github.com/huggingface/optimum-benchmark">Optimum-Benchmark</a>.</li>
	<li>Throughput is measured in tokens per second when generating 1000 tokens with a batch size of 1.</li>
	<li>Peak memory is measured in MB during the first forward pass of the model (no warmup).</li>
	<li>Open LLM Score is an average evaluation score obtained from the <a href="https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard">🤗 Open LLM Leaderboard</a>.</li>
	<li>Open LLM Tradeoff is the Euclidean distance between an LLM and the "perfect LLM" (i.e. 0 latency and 100% accuracy), which captures the tradeoff between latency and accuracy.</li>
	</ul>
	"""
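
# Illustrative sketch (not part of the leaderboard code): one way the
# "Open LLM Tradeoff" described above could be computed. Latency and score use
# different units, so this sketch assumes latency is first rescaled to a 0-100
# range by dividing by the largest latency among the compared models. That
# normalization, the function name, and the parameters are assumptions; the
# leaderboard's actual formula is not shown in this file.

```python
import math


def open_llm_tradeoff(score: float, latency: float, max_latency: float) -> float:
    """Distance to the ideal point (0 latency, score of 100); lower is better."""
    if max_latency <= 0:
        raise ValueError("max_latency must be positive")
    # Rescale latency to 0-100 so both axes are comparable (assumed normalization).
    normalized_latency = 100.0 * latency / max_latency
    # Euclidean distance between (normalized_latency, score) and (0, 100).
    return math.hypot(normalized_latency, 100.0 - score)


# Example: a model scoring 60 with a latency of 3 s, where the slowest model takes 10 s.
tradeoff = open_llm_tradeoff(score=60.0, latency=3.0, max_latency=10.0)
```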

	CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results."
	CITATION_BUTTON_TEXT = r"""@misc{open-llm-perf-leaderboard,
	author = {Ilyas Moutawwakil and Régis Pierrard},
	title = {Open LLM-Perf Leaderboard},
	year = {2023},
	publisher = {Hugging Face},
	howpublished = "\url{https://huggingface.co/spaces/optimum/llm-perf-leaderboard}",
	}
	@software{optimum-benchmark,
	author = {Ilyas Moutawwakil and Régis Pierrard},
	publisher = {Hugging Face},
	title = {Optimum-Benchmark: A framework for benchmarking the performance of Transformers models with different hardwares, backends and optimizations.},
	}
	"""