Scrape Instagram data with Python

12 May 2023
Francesco Cognolato
Founder

If you work in marketing or public relations, or as a content creator, you have certainly discovered how crucial it is to have a data-driven understanding of what is happening across the major social media platforms. Alongside long-standing platforms such as YouTube and Twitter, and rising stars like TikTok, Instagram is one of the most widely used social networks and a critical source of data for marketers.

However, extracting and processing large volumes of Instagram data reliably and consistently is not so simple! People usually end up either paying for very expensive APIs or failing to build something solid.

In this blog post, we show how to use the EnsembleData Instagram API to extract data from Instagram in 3 lines of code, and how these simple APIs can unlock many use cases, such as brand monitoring or influencer analysis.

What are EnsembleData’s APIs?

In short, we offer a set of APIs to extract data from all the major social media platforms, such as TikTok, Instagram and YouTube. All you need to do is make an API call, and our API does the hard work of scraping and retrieving the requested data for you. In other words, it is a direct and efficient way to extract data across different social media platforms at scale.

You can register on EnsembleData’s platform and start using the API for free. To use the APIs, you just need your personal access token. Once registered, you can find it at the top of the column on the left:

Figure: Preview of the EnsembleData dashboard. Shown in the top left is the personal API token used to authenticate requests to the EnsembleData API.

To activate the token, you’ll first need to verify your email address using the link sent to you upon sign-up!

Using the Python library to make a request

Once you’ve registered and have your token, you can start using the API!

One option is to send the requests directly to EnsembleData; the documentation has examples of how to make the calls. Alternatively, you can use our Python package, which provides a simpler interface for sending requests.

Let’s take a look now at a very simple example. Suppose you want to get a user’s recent posts, for example, to monitor what kind of content is being posted and its performance.

The first step is to set up a virtual environment and install the package:

python3 -m venv .venv
source .venv/bin/activate

pip install ensembledata

Note: the package currently requires Python 3.7+.


The main component is the EDClient class. It provides an entry point for making requests to any of the EnsembleData endpoints. When creating an instance of EDClient, we pass in the API token (which we got in the previous step).

from ensembledata.api import EDClient

client = EDClient("INSERT API TOKEN HERE")
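Pasting the token into the script is fine for a quick test, but you may prefer to keep it out of your source code. Here is a minimal sketch, assuming you have exported the token in an environment variable. The variable name ENSEMBLEDATA_TOKEN and the helper load_token are our own choices for this example, not something the package requires:

```python
import os


def load_token(env=os.environ, name="ENSEMBLEDATA_TOKEN"):
    """Return the API token stored under `name`, or raise if it is missing.

    Both the variable name and this helper are hypothetical; pick whatever
    fits your setup.
    """
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {name} environment variable to your API token")
    return token
```

You could then create the client with EDClient(load_token()) instead of hardcoding the token string.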

We are now ready to send the request! We call the user_posts method, passing the ‘user_id’ as input (in this case cristiano’s ‘user_id’) along with any other required or optional parameters: here the ‘depth’ (to control how many posts to retrieve) and the ‘oldest_timestamp’ (a Unix timestamp):

result = client.instagram.user_posts(user_id="173560420", depth=2, oldest_timestamp=1611308425)
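The ‘oldest_timestamp’ in the call above is expressed in seconds since the Unix epoch, which is not very readable. If you would rather start from a calendar date, a small helper based on the standard library can do the conversion. This is only a sketch, and to_unix_timestamp is a name we made up for the example:

```python
from datetime import datetime, timezone


def to_unix_timestamp(moment):
    """Convert a timezone-aware datetime to whole seconds since the Unix epoch."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime to avoid local-time surprises")
    return int(moment.timestamp())


# The value used in the request above corresponds to this moment in UTC.
oldest = to_unix_timestamp(datetime(2021, 1, 22, 9, 40, 25, tzinfo=timezone.utc))
print(oldest)  # 1611308425
```

Requiring a timezone-aware datetime is a design choice: a naive datetime would be interpreted in the local time zone of the machine running the script, so the same code could produce different timestamps on different machines.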

As simple as that! We just got all the data available for the 20 most recent posts of the user “cristiano”. Now let’s say we also want some more detailed information about the user, for example whether the account is verified, a business account or a professional profile.

We can also do that very easily with the detailed-info endpoint:

# Through the Python library (reuses the client created earlier)
result = client.instagram.user_detailed_info(username="cristiano")

# Or by sending the request directly
import requests

TOKEN = "INSERT API TOKEN HERE"
root = "https://ensembledata.com/apis"
endpoint = "/instagram/user/detailed-info"
params = {
    "username": "cristiano",
    "token": TOKEN,
}
res = requests.get(root + endpoint, params=params, timeout=30)
res.raise_for_status()  # raise an exception on HTTP error status codes
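To get a feel for what the direct request looks like, the sketch below assembles the full URL using only the standard library and does not send anything. The root and endpoint are the ones from the example above, while build_url is a hypothetical helper name introduced here for illustration:

```python
from urllib.parse import urlencode

ROOT = "https://ensembledata.com/apis"


def build_url(endpoint, params, root=ROOT):
    """Join root, endpoint and URL-encoded query parameters into one URL."""
    return f"{root}{endpoint}?{urlencode(params)}"


url = build_url(
    "/instagram/user/detailed-info",
    {"username": "cristiano", "token": "INSERT API TOKEN HERE"},
)
print(url)
```

Since the token travels in the query string, it is worth taking care not to print or log full request URLs in production code.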

And that is it for today! If you have any questions or are unsure about anything, contact us, or send us a message at [email protected].