Learn to perform language translation using the transformers library from Hugging Face in just 3 lines of code with Python.
The transformers library provides thousands of pre-trained models to perform tasks on texts such as classification, information extraction, question answering, summarization, translation, text generation, and more in over 100 languages. Its aim is to make cutting-edge NLP easier to use for everyone.
First, let us install the transformers library and its dependencies for language translation:
pip install transformers sentencepiece -q
Next, import the pipeline function from the transformers library:
# Importing the pipeline function from the transformers library
from transformers import pipeline
The pipeline function is responsible for:
- Pre-processing: Converting raw text input to numerical input for a given pre-trained model
- Model Inference: Making a prediction using a pre-trained model
- Post-processing: Converting prediction to a proper output
# Creating a Text2TextGenerationPipeline for language translation
pipe = pipeline(task='text2text-generation', model='facebook/m2m100_418M')
M2M100 is a multilingual encoder-decoder (seq-to-seq) model trained for Many-to-Many multilingual translation. The model can directly translate between 100 different languages without relying on English data. You can learn more about it from Facebook’s blog post.
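To make the three stages listed above concrete for M2M100, here is a minimal sketch that runs them by hand instead of through the pipeline. It assumes the M2M100ForConditionalGeneration and M2M100Tokenizer classes from transformers, with PyTorch installed. The helper names load_m2m100 and translate_by_hand are hypothetical and used only for illustration.

```python
from typing import Any


def load_m2m100(name: str = "facebook/m2m100_418M"):
    """Load the model and tokenizer (downloads the weights on first use)."""
    # Imported lazily so the helper below can be read and tested on its own.
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    model = M2M100ForConditionalGeneration.from_pretrained(name)
    tokenizer = M2M100Tokenizer.from_pretrained(name)
    return model, tokenizer


def translate_by_hand(model: Any, tokenizer: Any, text: str,
                      src_lang: str = "en", tgt_lang: str = "hi") -> str:
    """Hypothetical helper: the three pipeline stages, written out explicitly."""
    # 1. Pre-processing: raw text -> token ids, tagged with the source language.
    tokenizer.src_lang = src_lang
    encoded = tokenizer(text, return_tensors="pt")

    # 2. Model inference: generate ids, starting with the target language token.
    generated = model.generate(
        **encoded, forced_bos_token_id=tokenizer.get_lang_id(tgt_lang)
    )

    # 3. Post-processing: token ids -> readable text.
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

A call might look like model, tokenizer = load_m2m100() followed by translate_by_hand(model, tokenizer, "That is a flower"). The pipeline wraps the same steps, which is why it needs so little code.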
# Translating an English sentence to Hindi ('hi')
pipe("That is a flower", forced_bos_token_id=pipe.tokenizer.get_lang_id(lang='hi'))
Here, we pass the forced_bos_token_id parameter to force the target language id as the first generated token.
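Because the target language is chosen per call through forced_bos_token_id, the same pipe can serve several target languages. The sketch below is a hypothetical helper (translate_to_many is not part of transformers). It assumes the pipeline returns a list of dicts with a 'generated_text' key, and that the installed transformers version still provides the text2text-generation task.

```python
from typing import Any, Dict, Iterable


def translate_to_many(pipe: Any, text: str, target_langs: Iterable[str],
                      src_lang: str = "en") -> Dict[str, str]:
    """Hypothetical helper: translate one text into several target languages."""
    # Tell the tokenizer which language the input text is written in.
    pipe.tokenizer.src_lang = src_lang

    results: Dict[str, str] = {}
    for lang in target_langs:
        # Look up the id of the target language token, e.g. 'hi' for Hindi.
        lang_id = pipe.tokenizer.get_lang_id(lang)
        # Force that token to be the first one the model generates.
        output = pipe(text, forced_bos_token_id=lang_id)
        results[lang] = output[0]["generated_text"]
    return results
```

With the pipe created earlier, translate_to_many(pipe, "That is a flower", ["hi", "fr", "de"]) would return a dict mapping each language code to its translation. Setting src_lang explicitly also makes it possible to translate from a language other than English.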