README.md

LLM Simulation

We integrate the GenZ LLM simulator. Note that this is not the LLM simulator used in the RAGO paper, because Google's production LLM simulator is not yet open-sourced. As a result, the performance numbers generated here differ slightly from those in the RAGO paper.

Estimate LLM serving performance

Run the example script with:

cd genz_scripts
python llm_perf.py

To change which models and hardware are simulated, edit llm_perf.py.

The generated results are saved in genz_scripts/perf_results. For example, the performance of the main LLM can be found in genz_scripts/perf_results/main_llm_perf.csv.
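
To inspect a generated results file without opening it by hand, something like the following sketch may help. It assumes only that the CSV has a header row; it makes no assumption about the column names, which depend on what llm_perf.py writes. The helper name load_perf_results is hypothetical and is not part of the repository.

```python
import csv
from pathlib import Path

# Path taken from the README; assumes you run this from the repository root.
RESULTS_CSV = Path("genz_scripts/perf_results/main_llm_perf.csv")


def load_perf_results(path):
    """Return (column_names, rows) for a results CSV.

    Hypothetical helper: assumes the first line of the file is a header row.
    Each row is a dict keyed by column name, with values kept as strings.
    """
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        columns = list(reader.fieldnames or [])
    return columns, rows


if __name__ == "__main__":
    if RESULTS_CSV.exists():
        columns, rows = load_perf_results(RESULTS_CSV)
        print("columns:", columns)
        print("number of rows:", len(rows))
        # Show the first few rows as a quick sanity check.
        for row in rows[:5]:
            print(row)
    else:
        print(f"{RESULTS_CSV} not found; run llm_perf.py first")
```

Values are left as strings because the column types are not documented here; convert them as needed once you know which columns the script produces.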
