Building your own named entity recognition model in .NET
Imagine you’re reading a captivating novel or a breaking news article. As you move through the sentences, you effortlessly pick out the names of characters, places, organizations, and significant events. This ability to identify and categorize specific pieces of information in text comes naturally to us as humans. But how do we teach a computer to do the same? Enter Named Entity Recognition (NER), a fascinating area of natural language processing that focuses on teaching machines to locate and classify key elements in text.
Named entity recognition is all about enabling computers to read a piece of text and identify “named entities”, that is, the proper nouns and phrases that refer to specific objects like people, organizations, locations, dates, and monetary values. It’s like giving the machine a highlighter and asking it to mark all the important names and terms in a document. This process transforms unstructured text into structured data, making it more accessible for analysis and applications.
Why is NER so important? In our digital age, vast amounts of textual data are generated every second from news articles, social media posts, emails, and research papers. Extracting meaningful information from this sea of words is crucial for tasks like information retrieval, question answering, sentiment analysis, and more. NER serves as a foundational tool that helps in organizing and making sense of this data by pinpointing the most critical elements.
Training a NER model with ML.NET
Let’s now go through the process of creating our own NER-capable model in .NET by using ML.NET. The complete solution that we will be building is available via the link below:
https://github.com/fiodarsazanavets/mlnet-project-samples/tree/main/Named-entity-recognition
Before we start, note that this example comes from my book, which I wrote to make it easy for developers who already know C# and .NET to get into machine learning. The book covers the main fundamentals of machine learning and doesn’t focus solely on building LLMs. If you are interested, here’s where you can get your copy, which is currently in pre-release and is close to being completed:
Machine Learning for C# Developers Made Easy
Let’s continue. We will assume that you already know the .NET fundamentals and that you know how to create a new project and add NuGet packages to it. Otherwise, you will need to go through a C# tutorial before you can follow the rest of this article.
First, we will create a console application project and call it NamedEntityRecognition. In ML.NET, NER is enabled by integrating ML.NET with PyTorch. Therefore, we will need to install the Microsoft.ML.TorchSharp NuGet package in our project. We will also need to install another NuGet package that integrates the PyTorch functionality with the specific hardware and operating system. For example, the package that will use a CUDA-enabled GPU on Windows will be TorchSharp-cuda-windows. Since we will be working with a CSV file, we will also need to install the CsvHelper package.
Next, we will add a class that will hold the input data. Its structure can be seen below:
namespace NamedEntityRecognition;
public class Input
{
    public string Sentence { get; set; }
    public string[] Label { get; set; }
}
It has two properties: Sentence and Label. The Sentence property represents the original sentence. The Label property contains labels for each token inside the sentence. We are using a fairly simple example, so you can think of each word as a separate token.
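To make the one-label-per-token relationship concrete, here is a minimal sketch that pairs each word of a sentence with its label and rejects rows where the counts differ. It assumes, as this example does, that tokens are simply whitespace-separated words. The PairTokens helper is hypothetical and is not part of ML.NET or of the sample solution.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not part of ML.NET): pairs each whitespace-separated
// word with its label, assuming exactly one label per word.
static (string Token, string Label)[] PairTokens(string sentence, string[] labels)
{
    string[] tokens = sentence.Split(' ', StringSplitOptions.RemoveEmptyEntries);

    // A mismatch means the row cannot be aligned word by word
    if (tokens.Length != labels.Length)
    {
        throw new ArgumentException(
            $"Expected {tokens.Length} labels but got {labels.Length}.");
    }

    return tokens
        .Zip(labels, (token, label) => (Token: token, Label: label))
        .ToArray();
}

var pairs = PairTokens(
    "Charlie visited Paris in France",
    new[] { "PERSON", "0", "CITY", "0", "COUNTRY" });

// Printing every word next to its label
foreach (var (token, label) in pairs)
{
    Console.WriteLine($"{token} -> {label}");
}
```

A check like this could be run over the training rows before they are passed to the model, so that misaligned rows are caught early.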
Here is the structure of the output data:
namespace NamedEntityRecognition;
public class Output
{
    public string[] Predictions;
}
It has a string array field called Predictions. Each value in the array represents the label that the model predicted for the corresponding token in the original sentence.
Then, we will need the Label class shown below that will be used to populate the model with the unique label values in the format that we want to work with.
namespace NamedEntityRecognition;
public class Label
{
    public string Key { get; set; }
}
To create all the unique values that we need, we will add the LabelsHelper class, which looks like this:
namespace NamedEntityRecognition;
public static class LabelsHelper
{
    public static Label[] GetLabels() => [
        new Label { Key = "PERSON" }, // People
        new Label { Key = "NORP" }, // Demographic groups
        new Label { Key = "FAC" }, // Buildings
        new Label { Key = "ORG" }, // Organizations
        new Label { Key = "CITY" }, // Cities
        new Label { Key = "COUNTRY" }, // Countries
        new Label { Key = "CONTINENT" }, // Continents
        new Label { Key = "LOCATION" }, // Geographic features
        new Label { Key = "PRODUCT" }, // Objects
        new Label { Key = "EVENT" }, // Events
        new Label { Key = "WORK_OF_ART" }, // Works of art
        new Label { Key = "LAW" }, // Legal documents
        new Label { Key = "LANGUAGE" }, // Languages
        new Label { Key = "DATE" }, // Date
        new Label { Key = "TIME" }, // Time
        new Label { Key = "PERCENT" }, // Percentage
        new Label { Key = "MONEY" }, // Monetary value
        new Label { Key = "QUANTITY" }, // Quantity measurements
        new Label { Key = "ORDINAL" }, // Order
        new Label { Key = "CARDINAL" }, // Other numbers
    ];
}
This class has a static GetLabels() method. When invoked, it returns a collection of all unique labels supported by our model.
The bulk of our application logic will go into the TrainingDataProcessor class:
using CsvHelper.Configuration;
using CsvHelper;
using System.Globalization;
namespace NamedEntityRecognition;
public static class TrainingDataProcessor
{
    public static IEnumerable<Input> LoadDataFromFile()
    {
        // Prompting the user to enter the path to the training data
        Console.WriteLine(
            "Please provide the path to the input file.");
        // Validating against an empty input
        string inputFilePath = Console.ReadLine() ?? string.Empty;
        ArgumentException.ThrowIfNullOrWhiteSpace(inputFilePath);
        // Validating against a non-existent file
        if (!File.Exists(inputFilePath))
        {
            throw new ArgumentException(
                "A file does not exist at the path specified");
        }
        // Configuring the CSV reader
        var config =
            new CsvConfiguration(
                CultureInfo.InvariantCulture)
            {
                HasHeaderRecord = true,
                TrimOptions = TrimOptions.Trim,
            };
        List<Input> inputs = [];
        // Opening the CSV file reader
        using (var reader = new StreamReader(inputFilePath))
        using (var csv = new CsvReader(reader, config))
        {
            // Skipping the first row as it contains headers
            csv.Read();
            // Reading each row and extracting the data
            while (csv.Read())
            {
                var sentence = csv.GetField(0);
                var labelString = csv.GetField(1)?
                    .Replace("\"", string.Empty);
                var labels = labelString?
                    .Trim([ '[', ']', '"' ])
                    .Split([',', ' '],
                        StringSplitOptions.RemoveEmptyEntries);
                for (int i = 0; i < labels?.Length; i++)
                {
                    labels[i] = labels[i].Trim('\'');
                }
                inputs.Add(new Input
                {
                    Sentence = sentence!,
                    Label = labels!
                });
            }
        }
        return inputs;
    }
}
The main static method of the class, LoadDataFromFile(), prompts the user to enter the full path to the input CSV file. If the path is valid and the data in the file adheres to the expected format, it loads the data into a collection of Input records, ready to be used for training the model.
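The label parsing is the least obvious part of this method, so here is a minimal sketch that isolates it and applies the same Replace, Trim, and Split steps to a single raw Label field. The ParseLabels name is hypothetical. The sketch uses only the base class library, so it can be tried without CsvHelper.

```csharp
using System;

// Hypothetical helper that mirrors the parsing steps of LoadDataFromFile()
// for one raw Label field.
static string[] ParseLabels(string rawField)
{
    string[] labels = rawField
        .Replace("\"", string.Empty)   // dropping any double quotes
        .Trim('[', ']')                // dropping the surrounding brackets
        .Split(new[] { ',', ' ' },     // splitting on commas and spaces
            StringSplitOptions.RemoveEmptyEntries);

    for (int i = 0; i < labels.Length; i++)
    {
        // Dropping the single quotes around each label
        labels[i] = labels[i].Trim('\'');
    }

    return labels;
}

string[] parsed = ParseLabels("['PERSON', '0', 'CITY', '0', 'COUNTRY']");
Console.WriteLine(string.Join(" | ", parsed));
```

Because the split is done on both commas and spaces, this approach relies on individual labels never containing a space, which holds for keys such as WORK_OF_ART.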
Finally, we will add the Program.cs file, which acts as the entry point for our application, with the code shown below. The code loads the labels, loads the training data, trains the model, and runs a prediction on a test sentence.
using Microsoft.ML.Data;
using Microsoft.ML;
using NamedEntityRecognition;
using Microsoft.ML.TorchSharp;
using Microsoft.ML.Transforms;
using Microsoft.ML.Tokenizers;
try
{
    // Creating ML context
    MLContext context = new()
    {
        FallbackToCpu = true,
        GpuDeviceId = 0
    };
    // Loading the list of unique token labels
    IDataView labels =
        context.Data.LoadFromEnumerable(
            LabelsHelper.GetLabels());
    // Loading training data from a file
    IDataView dataView = context.Data.LoadFromEnumerable(
        TrainingDataProcessor.LoadDataFromFile());
    EstimatorChain<ITransformer> chain = new();
    EstimatorChain<KeyToValueMappingTransformer> estimator =
        chain.Append(
            // Pre-processing the label data
            context.Transforms.Conversion.MapValueToKey(
                "Label", keyData: labels))
        .Append(context.MulticlassClassification
            // Performing the NER training step
            .Trainers.NamedEntityRecognition(
                outputColumnName: "Predictions"))
        // Mapping numeric values to the human-readable predictions
        .Append(context.Transforms.Conversion.MapKeyToValue("Predictions"));
    // Training the model
    using TransformerChain<KeyToValueMappingTransformer> transformer =
        estimator.Fit(dataView);
    // Selecting the statement to analyze
    string sentence =
        "Alice and Bob visited Paris in France and met with Microsoft";
    PredictionEngine<Input, Output> engine =
        context.Model.CreatePredictionEngine<Input, Output>(transformer);
    // Making an NER prediction from the model
    Output predictions = engine.Predict(new Input { Sentence = sentence });
    Console.WriteLine(sentence);
    // Printing the predicted labels for every word of the statement
    Console.WriteLine(string.Join(", ", predictions.Predictions));
    Console.ReadLine();
}
catch (Exception ex)
{
    Console.WriteLine($"Error: {ex.Message}");
    Console.ReadLine();
}
We should now have everything ready to test our model.
Testing the model
To train the model, we can download the following CSV dataset:
The first few rows of the CSV file look as follows:
Sentence,Label
Alice and Bob live in the USA,"['PERSON', '0', 'PERSON', '0', '0', '0', 'COUNTRY']"
Charlie visited Paris in France,"['PERSON', '0', 'CITY', '0', 'COUNTRY']"
Emily works at Google in California,"['PERSON', '0', '0', 'ORG', '0', 'STATE']"
James met Laura in London,"['PERSON', '0', 'PERSON', '0', 'CITY']"
Sarah and John are friends,"['PERSON', '0', 'PERSON', '0', '0']"
Microsoft is a major company in the USA,"['ORG', '0', '0', '0', '0', '0', '0', 'COUNTRY']"
The Eiffel Tower is in Paris,"['0', 'LOCATION', 'LOCATION', '0', '0', 'CITY']"
Canada is a country in America,"['COUNTRY', '0', '0', '0', '0', 'CONTINENT']"
Anna is from Germany,"['PERSON', '0', '0', 'COUNTRY']"
Toyota is headquartered in Japan,"['ORG', '0', '0', '0', 'COUNTRY']"
Google and Facebook are competitors,"['ORG', '0', 'ORG', '0', '0']"
The file consists of the following columns:
Sentence, which contains the free-text sentence.
Label, which is a collection of token labels written as a JSON-like array of single-quoted strings. Each label corresponds to one word in the original sentence. For unlabelled tokens, the placeholder value of 0 is used.
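One thing worth checking in a dataset like this is whether every label is actually one of the keys returned by LabelsHelper.GetLabels(). For instance, the sample rows above use STATE, which is not in that list. The sketch below is a hypothetical pre-training check, not part of the sample solution. It repeats only a subset of the supported keys to stay self-contained, and it treats 0 as the unlabelled placeholder. How the trainer handles labels outside the key set is not covered here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A subset of the keys from LabelsHelper.GetLabels(), repeated here so
// that the sketch is self-contained.
var supportedKeys = new HashSet<string>
{
    "PERSON", "ORG", "CITY", "COUNTRY", "CONTINENT", "LOCATION"
};

// Hypothetical helper: returns the labels that are neither the "0"
// placeholder nor one of the supported keys.
static string[] FindUnknownLabels(IEnumerable<string> labels, ISet<string> knownKeys) =>
    labels
        .Where(label => label != "0" && !knownKeys.Contains(label))
        .Distinct()
        .ToArray();

// Labels of the "Emily works at Google in California" row
string[] unknown = FindUnknownLabels(
    new[] { "PERSON", "0", "0", "ORG", "0", "STATE" },
    supportedKeys);

Console.WriteLine(unknown.Length == 0
    ? "All labels are supported."
    : $"Unsupported labels: {string.Join(", ", unknown)}");
```

Depending on the goal, such labels could either be added to the supported list or replaced in the data before training.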
We can then launch the application, where we will be presented with the following prompt:
Please provide the path to the input file.
It may take some time for the model to consume the data set and get trained. After the training is completed, it will print out the testing sentence, which is as follows:
Alice and Bob visited Paris in France and met with Microsoft
Immediately after this, it will print out a comma-separated list of predicted labels for every word in the sentence.
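A flat list of labels is not very convenient to consume, so here is a minimal sketch of turning the sentence and its predictions into a list of entities. It assumes that unlabelled words come back as empty strings and that consecutive words sharing a label belong to the same entity. That second assumption is a simplification, because two neighboring entities of the same type would be merged. The ExtractEntities helper is hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: merges consecutive words that share the same
// non-empty label into a single (Text, Label) entity.
static List<(string Text, string Label)> ExtractEntities(
    string sentence, string[] predictions)
{
    string[] words = sentence.Split(' ', StringSplitOptions.RemoveEmptyEntries);
    var result = new List<(string Text, string Label)>();

    for (int i = 0; i < words.Length && i < predictions.Length; i++)
    {
        string current = predictions[i];
        if (string.IsNullOrEmpty(current))
        {
            continue; // unlabelled word
        }

        bool continuesPrevious =
            i > 0 && predictions[i - 1] == current && result.Count > 0;

        if (continuesPrevious)
        {
            // Extending the entity that was started by the previous word
            var last = result[result.Count - 1];
            result[result.Count - 1] = (last.Text + " " + words[i], current);
        }
        else
        {
            result.Add((words[i], current));
        }
    }

    return result;
}

var found = ExtractEntities(
    "Alice and Bob visited Paris in France and met with Microsoft",
    new[] { "PERSON", "", "PERSON", "", "CITY", "", "COUNTRY", "", "", "", "ORG" });

foreach (var (entityText, entityLabel) in found)
{
    Console.WriteLine($"{entityLabel}: {entityText}");
}
```

The predictions array in this usage example is hand-written to match the expected labels for the test sentence. It is not captured model output.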
Please note that, at the time of writing, NER functionality is in preview, and some unexpected behavior may occur. One known issue is that it can sometimes offset the predicted labels by one element, even when it predicts them correctly. For example, instead of the expected result of
PERSON, , PERSON, , CITY, , COUNTRY, , , , ORG
you may see the following result:
, PERSON, , PERSON, , CITY, , COUNTRY, , , ,
The labels are correct, other than the fact that they start at index 1 instead of 0. If we move them one position to the left, they will match the positions of the words in the original sentence.
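If you run into this offset, the shift can be undone in code. The sketch below is a hypothetical workaround rather than an official fix. It assumes the offset is exactly one element, and it assumes the shifted array still has one entry per word (11 here), which means the label of the last word has already been lost and stays empty after realignment.

```csharp
using System;
using System.Linq;

// Hypothetical workaround: drops the first element and pads the end with
// empty strings, so that the result has exactly one entry per word.
static string[] ShiftLeft(string[] predictions, int wordCount) =>
    predictions
        .Skip(1)
        .Concat(Enumerable.Repeat(string.Empty, wordCount))
        .Take(wordCount)
        .ToArray();

// The shifted result from above, written out by hand (assumed 11 elements)
string[] shifted =
{
    "", "PERSON", "", "PERSON", "", "CITY", "", "COUNTRY", "", "", ""
};

string[] realigned = ShiftLeft(shifted, wordCount: 11);
Console.WriteLine(string.Join(", ", realigned));
```

Since a correctly aligned result can also start with an unlabelled word, a shift like this should only be applied when you have confirmed that the offset is really happening for your model.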
Wrapping up
This completes our exercise on building our own NER model. If there is enough demand for it, I can write another article describing how NER works and what its practical applications are in the day-to-day use of LLMs and AI agents.
Right now, I am frequently being asked what skills one needs to develop to become an AI engineer. This is what I will talk about next time, so watch this space!


