While the non-AI software engineering world is still getting used to metrics, they are an important part of the AI world as well.
Examples include:
Quality of the chatbot responses. Thumbs up. Thumbs down.
Chat message follow-up rate.
Accuracy. Precision. Recall. F1 score. Hallucination rate.
Inference latency. Time to first token. Time per full answer. Time per token.
P70, P95 tokens per response. Tokens per prompt.
Memory usage. GPU/CPU utilization.
Cost per job.
Median, P75 of agents involved in a job.
Energy consumption.
Labeling accuracy.
Error rates.
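As a rough illustration of the accuracy, precision, recall, and F1 entries above, here is a minimal Python sketch that derives them from binary labels. The names ConfusionCounts, count_outcomes, and classification_metrics are made up for this example, the labels are invented sample data, and returning 0.0 when a denominator is zero is a convention assumed here, not a standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for binary classification outcomes."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def count_outcomes(y_true, y_pred):
    """Tally outcomes from two equal-length sequences of 0/1 labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif pred:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, fn, tn)


def _safe_div(numerator, denominator):
    # Assumed convention: an undefined ratio is reported as 0.0.
    return numerator / denominator if denominator else 0.0


def classification_metrics(c):
    """Return accuracy, precision, recall, and F1 for the given counts."""
    precision = _safe_div(c.tp, c.tp + c.fp)
    recall = _safe_div(c.tp, c.tp + c.fn)
    # F1 is the harmonic mean of precision and recall.
    f1 = _safe_div(2 * precision * recall, precision + recall)
    accuracy = _safe_div(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Invented sample data: 1 = relevant/correct, 0 = not.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
    for name, value in classification_metrics(count_outcomes(y_true, y_pred)).items():
        print(f"{name}: {value:.3f}")
```

Working from raw counts keeps the sketch dependency-free. In practice a library such as scikit-learn would likely be used instead.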
There are more, and there will be more. Some will be valuable in a given context and time; others will be pointless.
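The latency and percentile metrics from the list (time to first token, time per token, P95 tokens per response) can be sketched in a few lines of Python. This is only an illustration: ResponseTiming and its fields are hypothetical, the numbers are invented, and the percentile uses the nearest-rank definition, which is one of several common ones and may differ slightly from what a monitoring tool reports.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponseTiming:
    """Hypothetical timing record for one streamed response (seconds)."""
    request_sent: float
    first_token_at: float
    last_token_at: float
    output_tokens: int


def time_to_first_token(t):
    return t.first_token_at - t.request_sent


def time_per_full_answer(t):
    return t.last_token_at - t.request_sent


def time_per_token(t):
    """Average gap between output tokens after the first one."""
    if t.output_tokens <= 1:
        return 0.0
    return (t.last_token_at - t.first_token_at) / (t.output_tokens - 1)


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of the data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


if __name__ == "__main__":
    # Invented sample data.
    timing = ResponseTiming(request_sent=0.0, first_token_at=0.25,
                            last_token_at=2.25, output_tokens=101)
    print("time to first token:", time_to_first_token(timing))
    print("time per full answer:", time_per_full_answer(timing))
    print("time per token:", time_per_token(timing))

    tokens_per_response = [120, 80, 200, 150, 95, 310, 60, 175, 140, 400]
    for p in (50, 75, 95):
        print(f"P{p} tokens per response:", percentile(tokens_per_response, p))
```

Percentiles are reported next to the median because a few very long responses can dominate cost and latency while barely moving the average.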
Sources:
AI Engineering by Chip Huyen (O’Reilly). Copyright 2025 Developer Experience Advisory LLC, 978-1-098-16630-4
Additional research