Language Translation using Hugging Face and Python in 3 lines of code

Learn to perform language translation using the transformers library from Hugging Face in just 3 lines of code with Python.

The transformers library provides thousands of pre-trained models to perform tasks on texts such as classification, information extraction, question answering, summarization, translation, text generation, and more in over 100 languages. Its aim is to make cutting-edge NLP easier to use for everyone.

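First, let us install the transformers library and sentencepiece, which the translation model's tokenizer depends on. The model also needs a deep learning backend such as PyTorch, so install that too if your environment does not already have it: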

pip install transformers sentencepiece -q

Next, import the pipeline function from the transformers library:

# Importing the pipeline function from the transformers library
from transformers import pipeline

Pipeline method for Language Translation

The pipeline returned by this function is responsible for:

  • Pre-processing: Converting raw text input to numerical input for a given pre-trained model
  • Model Inference: Making a prediction using a pre-trained model
  • Post-processing: Converting the prediction into human-readable output

Now, let us create a pipeline for translation with the M2M100 model:

# Creating a Text2TextGenerationPipeline for language translation
pipe = pipeline(task='text2text-generation', model='facebook/m2m100_418M')
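To make the three stages listed above more concrete, here is a rough sketch of what they could look like when done by hand instead of through the pipeline. The function names translate_step_by_step and load_m2m100 are made up for this illustration, and the sketch assumes the M2M100Tokenizer and M2M100ForConditionalGeneration classes from transformers with PyTorch installed. It is not necessarily how the pipeline is implemented internally.

```python
def translate_step_by_step(text, tokenizer, model, target_lang):
    """Run the three pipeline stages by hand (illustrative helper, not a library API).

    `tokenizer` and `model` are assumed to be an M2M100Tokenizer and an
    M2M100ForConditionalGeneration loaded from 'facebook/m2m100_418M'.
    """
    # 1. Pre-processing: raw text -> numerical input (PyTorch tensors)
    encoded = tokenizer(text, return_tensors="pt")

    # 2. Model inference: generate output token ids.
    #    forced_bos_token_id makes the target-language token the first
    #    generated token, which selects the output language.
    generated_ids = model.generate(
        **encoded,
        forced_bos_token_id=tokenizer.get_lang_id(target_lang),
    )

    # 3. Post-processing: token ids -> readable text
    return tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]


def load_m2m100(name="facebook/m2m100_418M"):
    """Load the tokenizer and model (downloads the weights on first use)."""
    # Imported lazily so that defining these helpers needs no heavy dependency
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained(name)
    model = M2M100ForConditionalGeneration.from_pretrained(name)
    return tokenizer, model
```

To try it, call tokenizer, model = load_m2m100() once and then translate_step_by_step("That is a flower", tokenizer, model, "hi"). The pipeline wraps these same kinds of steps behind a single call, which is why the rest of this post uses it.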

M2M100 is a multilingual encoder-decoder (seq-to-seq) model trained for Many-to-Many multilingual translation. The model can directly translate between 100 different languages without relying on English data. You can learn more about it from Facebook's blog post.

# Translating an English sentence to Hindi ('hi')
pipe("That is a flower", forced_bos_token_id=pipe.tokenizer.get_lang_id(lang='hi'))

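Here, we pass the forced_bos_token_id parameter so that the target language id (Hindi, 'hi') is the first generated token, which tells the model which language to translate into. The pipeline returns a list with one dictionary per input, and the translation is stored under the generated_text key.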
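If you translate often, the call above can be wrapped in a small helper. The sketch below is only an illustration: translate is a hypothetical function name, and it assumes that pipe is the M2M100 pipeline created earlier and that the pipeline's tokenizer honors src_lang in the same way the standalone M2M100 tokenizer does.

```python
def translate(pipe, text, target_lang, source_lang="en"):
    """Translate `text` with an M2M100 text2text-generation pipeline.

    Hypothetical convenience wrapper, not part of the transformers library.
    """
    # Tell the tokenizer which language the input is written in
    # (assumption: the pipeline reuses this setting when encoding the text)
    pipe.tokenizer.src_lang = source_lang

    # Look up the id of the target-language token
    target_id = pipe.tokenizer.get_lang_id(target_lang)

    # Force that token to be generated first so the output is in target_lang
    outputs = pipe(text, forced_bos_token_id=target_id)

    # One input string gives a list whose first dict holds the translation
    return outputs[0]["generated_text"]
```

With the pipe object from this post, translate(pipe, "That is a flower", "hi") should reproduce the English to Hindi example, and something like translate(pipe, "La vie est belle", "de", source_lang="fr") would go from French to German without passing through English.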


Written by
The Click Reader