# Zoo API: Natural Language Processing (NLP)
The xinfer::zoo::nlp module provides a suite of high-performance pipelines for common Natural Language Processing tasks.
These classes are built on top of hyper-optimized TensorRT engines for state-of-the-art Transformer and sequence models. They are designed to bring high-throughput, low-latency language understanding to your native C++ applications, enabling tasks from real-time sentiment analysis to complex document processing.
A core component of these pipelines is the tokenizer, which converts raw text into the integer token IDs the models consume. Each zoo class bundles its own internal tokenizer, configured via the `vocab_path` field of its config, so you pass raw `std::string` text and the pipeline handles tokenization for you.
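To make the idea concrete, here is a toy illustration of text-to-ID mapping. The vocabulary and IDs below are made up for demonstration; real pipelines load a full WordPiece/SentencePiece vocabulary from `vocab_path`:

```cpp
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

int main() {
    // Toy vocabulary: a real one contains tens of thousands of entries.
    std::unordered_map<std::string, int> vocab = {
        {"[CLS]", 101}, {"[SEP]", 102}, {"xinfer", 2003}, {"is", 2004}, {"fast", 2005}};

    // "xinfer is fast" -> IDs, wrapped in the model's special tokens.
    std::vector<int> ids = {vocab["[CLS]"], vocab["xinfer"], vocab["is"],
                            vocab["fast"], vocab["[SEP]"]};
    for (int id : ids) std::cout << id << ' '; // prints: 101 2003 2004 2005 102
    std::cout << '\n';
}
```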
## `Classifier`

Performs text classification. Given a piece of text, it assigns it to a pre-defined category (e.g., sentiment, topic, intent).

**Header:** `#include <xinfer/zoo/nlp/classifier.h>`

```cpp
#include <xinfer/zoo/nlp/classifier.h>

#include <cstdio>
#include <iostream>
#include <string>
#include <vector>

int main() {
    // 1. Configure the text classifier.
    //    The engine would be a pre-built BERT or DistilBERT model.
    xinfer::zoo::nlp::ClassifierConfig config;
    config.engine_path = "assets/sentiment_bert.engine";
    config.labels_path = "assets/sentiment_labels.txt"; // e.g., "negative", "positive"
    config.vocab_path = "assets/bert_vocab.txt";        // For the internal tokenizer

    // 2. Initialize.
    xinfer::zoo::nlp::Classifier classifier(config);

    // 3. Predict the sentiment of a sentence.
    std::string text = "xInfer is an incredibly fast and easy-to-use library!";
    auto results = classifier.predict(text, 2); // Get top-2 results

    // 4. Print the results.
    std::cout << "Sentiment analysis for: \"" << text << "\"\n";
    for (const auto& result : results) {
        printf("  - Label: %-10s, Confidence: %.4f\n", result.label.c_str(), result.confidence);
    }
}
```
**Config Struct:** `ClassifierConfig`
**Input:** `std::string` of text.
**Output Struct:** `TextClassificationResult` (contains class ID, label, and confidence).
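The exact layout of the result struct lives in the header; as a rough sketch of what the example above implies (the `class_id` member name is an assumption):

```cpp
#include <string>

// Sketch only, inferred from the fields listed above; see classifier.h for the
// authoritative definition.
struct TextClassificationResult {
    int class_id;      // index of the predicted class (assumed name)
    std::string label; // human-readable label loaded from labels_path
    float confidence;  // softmax probability for this class
};
```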
---
## `Embedder`
Converts a piece of text into a fixed-size, high-dimensional vector (an "embedding") that captures its semantic meaning. This is the backbone of modern semantic search and RAG systems.
**Header:** `#include <xinfer/zoo/nlp/embedder.h>`
```cpp
#include <xinfer/zoo/nlp/embedder.h>

#include <iostream>
#include <string>
#include <vector>

int main() {
    // 1. Configure the embedder.
    //    The engine would be a pre-built Sentence-BERT model.
    xinfer::zoo::nlp::EmbedderConfig config;
    config.engine_path = "assets/sentence_bert.engine";
    config.vocab_path = "assets/bert_vocab.txt";

    // 2. Initialize.
    xinfer::zoo::nlp::Embedder embedder(config);

    // 3. Create embeddings for a list of sentences.
    std::vector<std::string> texts = {
        "The cat sat on the mat.",
        "A feline was resting on the rug."
    };
    std::vector<xinfer::zoo::nlp::TextEmbedding> embeddings = embedder.predict_batch(texts);

    // 4. Compare the two embeddings using cosine similarity.
    float similarity = xinfer::zoo::nlp::Embedder::compare(embeddings[0], embeddings[1]);
    std::cout << "Semantic similarity between the two sentences: " << similarity << std::endl;
}
```

**Config Struct:** `EmbedderConfig`
**Input:** `std::string` or `std::vector<std::string>`.
**Output:** `TextEmbedding` (a `std::vector<float>`).
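If you ever need the similarity score without the `compare` helper, cosine similarity over two `TextEmbedding` vectors is easy to compute yourself. A minimal sketch, assuming both vectors have the same length and are non-zero:

```cpp
#include <cmath>
#include <vector>

// Cosine similarity between two embeddings: dot(a, b) / (|a| * |b|).
// Returns a value in [-1, 1]; values near 1 mean semantically similar text.
float cosine_similarity(const std::vector<float>& a, const std::vector<float>& b) {
    float dot = 0.0f, norm_a = 0.0f, norm_b = 0.0f;
    for (size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        norm_a += a[i] * a[i];
        norm_b += b[i] * b[i];
    }
    return dot / (std::sqrt(norm_a) * std::sqrt(norm_b));
}
```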
---

## `NER` (Named Entity Recognition)

Scans a piece of text and extracts named entities like people, organizations, and locations.

**Header:** `#include <xinfer/zoo/nlp/ner.h>`

```cpp
#include <xinfer/zoo/nlp/ner.h>

#include <iostream>
#include <string>

int main() {
    // 1. Configure the NER pipeline.
    xinfer::zoo::nlp::NERConfig config;
    config.engine_path = "assets/ner_bert.engine";
    config.labels_path = "assets/ner_labels.txt"; // e.g., "B-PER", "I-PER", "B-ORG"
    config.vocab_path = "assets/bert_vocab.txt";

    // 2. Initialize.
    xinfer::zoo::nlp::NER ner_pipeline(config);

    // 3. Predict.
    std::string text = "Apple Inc. was founded by Steve Jobs in Cupertino.";
    auto entities = ner_pipeline.predict(text);

    // 4. Print the extracted entities.
    std::cout << "Found " << entities.size() << " entities:\n";
    for (const auto& entity : entities) {
        std::cout << "  - Text: \"" << entity.text << "\", Label: " << entity.label << "\n";
    }
}
```

**Config Struct:** `NERConfig`
**Input:** `std::string`.
**Output Struct:** `NamedEntity` (contains the text, label, score, and position).
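The labels file follows the BIO scheme (`B-PER` begins a person span, `I-PER` continues it), and the example above suggests the pipeline merges these token-level tags into whole entity spans before returning them. As a rough sketch of the result type implied by the fields listed above (member names beyond `text` and `label` are assumptions):

```cpp
#include <string>

// Sketch only, inferred from the fields listed above; see ner.h for the
// authoritative definition.
struct NamedEntity {
    std::string text;  // the entity span, e.g., "Steve Jobs"
    std::string label; // the entity type, e.g., "PER" or "ORG"
    float score;       // model confidence for the span
    size_t start;      // character offset where the span begins (assumed)
    size_t end;        // character offset where the span ends (assumed)
};
```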
---

## `QuestionAnswering`

Finds the answer to a question within a given context paragraph.

**Header:** `#include <xinfer/zoo/nlp/question_answering.h>`

```cpp
#include <xinfer/zoo/nlp/question_answering.h>

#include <iostream>
#include <string>

int main() {
    // 1. Configure the QA pipeline.
    xinfer::zoo::nlp::QAConfig config;
    config.engine_path = "assets/qa_bert.engine";
    config.vocab_path = "assets/bert_vocab.txt";

    // 2. Initialize.
    xinfer::zoo::nlp::QuestionAnswering qa_pipeline(config);

    // 3. Ask a question about a context paragraph.
    std::string context = "xInfer is a C++ library designed for high-performance inference. It uses NVIDIA TensorRT to optimize models.";
    std::string question = "What technology does xInfer use?";
    auto result = qa_pipeline.predict(question, context);

    std::cout << "Question: " << question << "\n";
    std::cout << "Answer: " << result.answer << " (Score: " << result.score << ")\n";
}
```

**Config Struct:** `QAConfig`
**Input:** `std::string` for the question and `std::string` for the context.
**Output Struct:** `QAResult` (contains the answer text, score, and position).
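Extractive QA models like BERT have a fixed input window, so a long document is typically split into overlapping chunks and the best-scoring answer is kept. A minimal sketch of that pattern, assuming `predict` has the question/context signature shown above, `QAResult` is value-initializable, and its `score` is comparable across calls:

```cpp
#include <string>
#include <vector>

#include <xinfer/zoo/nlp/question_answering.h>

// Run QA over several context chunks and keep the highest-scoring answer.
xinfer::zoo::nlp::QAResult best_answer(xinfer::zoo::nlp::QuestionAnswering& qa,
                                       const std::string& question,
                                       const std::vector<std::string>& chunks) {
    xinfer::zoo::nlp::QAResult best{};
    for (const auto& chunk : chunks) {
        auto result = qa.predict(question, chunk);
        if (result.score > best.score) best = result;
    }
    return best;
}
```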
---

## `TextGenerator` / `CodeGenerator`

Provides an interface for running generative Large Language Models (LLMs) for text or code completion.

**Header:** `#include <xinfer/zoo/nlp/text_generator.h>`

```cpp
#include <xinfer/zoo/nlp/text_generator.h>

#include <iostream>
#include <string>

int main() {
    // 1. Configure the generator.
    xinfer::zoo::nlp::TextGeneratorConfig config;
    config.engine_path = "assets/llama3_8b.engine";
    config.vocab_path = "assets/llama_vocab.json";
    config.max_new_tokens = 100;

    // 2. Initialize.
    xinfer::zoo::nlp::TextGenerator generator(config);

    // 3. Stream a completion for the prompt.
    std::string prompt = "xInfer is a C++ library that enables ";
    std::cout << "Prompt: " << prompt;

    // The streaming function calls the lambda for each new piece of text generated.
    generator.predict_stream(prompt, [](const std::string& token_str) {
        std::cout << token_str << std::flush;
    });
    std::cout << std::endl;
}
```

**Config Struct:** `TextGeneratorConfig` / `CodeGeneratorConfig`
**Methods:** `.predict()` for a single string, and `.predict_stream()` for real-time, token-by-token streaming.
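For batch-style use where you want the whole completion at once, `.predict()` is the blocking counterpart to the streaming call. A brief sketch, assuming `.predict()` takes the prompt and returns the generated text as a `std::string` (check the header for the exact signature):

```cpp
#include <iostream>
#include <string>

#include <xinfer/zoo/nlp/text_generator.h>

int main() {
    xinfer::zoo::nlp::TextGeneratorConfig config;
    config.engine_path = "assets/llama3_8b.engine";
    config.vocab_path = "assets/llama_vocab.json";
    config.max_new_tokens = 100;

    xinfer::zoo::nlp::TextGenerator generator(config);

    // Blocking call: returns only after all tokens have been generated.
    std::string completion = generator.predict("Write a haiku about GPUs:");
    std::cout << completion << std::endl;
}
```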
