Why is Anaconda the most widely used Data Science platform for Python users?

It's simple. Anaconda lets you create separate, isolated environments, each with its own version of Python and its own set of packages. You can therefore have Python 2.7 and Python 3 on the same machine without them interfering with each other. You can also install a data science package such as NumPy in one environment but not in another, which gives you full control over each environment.
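
To make the idea of isolated environments concrete, here is a minimal sketch that builds the `conda create` commands for two environments and prints them instead of running them. The environment names `py27` and `py3` and the helper `conda_create_command` are made up for illustration; the script assumes Python 3.8 or newer for `shlex.join`.

```python
import shlex


def conda_create_command(env_name, python_version, packages=()):
    """Build the `conda create` command for one isolated environment."""
    return [
        "conda", "create",
        "--name", env_name,          # name of the new environment
        "--yes",                     # skip the confirmation prompt
        f"python={python_version}",  # pin the interpreter version
        *packages,                   # extra packages for this environment only
    ]


# Hypothetical environment names; NumPy goes into the second one only.
commands = [
    conda_create_command("py27", "2.7"),
    conda_create_command("py3", "3", packages=["numpy"]),
]

for command in commands:
    # Print instead of running, so nothing is installed by accident.
    print(shlex.join(command))
```

You can paste the printed commands into a terminal once conda is installed, then switch between the environments with `conda activate py27` or `conda activate py3`.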

To get started, we first need to install the program. There are essentially two versions:


Anaconda is the full distribution, which means it comes with many frequently used packages pre-installed. It is therefore larger out of the box and takes longer to download (roughly 3 GB and a few minutes). However, if you have plenty of disk space, just download Anaconda and save yourself the hassle of installing packages individually later. The Anaconda documentation lists all of the pre-installed packages.


You can think of Miniconda as a "Lite" version. It contains only Python and the conda package installer. Once you install Miniconda, you can install each of the individual packages you need. Miniconda is a great option if you don't have much disk space or you want to get up and running quickly.
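
Since Miniconda starts out nearly empty, it can help to check which packages are still missing before you install anything. The sketch below is one possible way to do that, not an official conda feature: the helper `missing_packages` and the wish list are illustrative, and it assumes each package's import name matches its conda package name, which is not always true.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names that cannot be imported in the current environment."""
    return [name for name in names if find_spec(name) is None]


# Illustrative wish list; replace it with the packages your project needs.
wanted = ["numpy", "pandas", "matplotlib"]

missing = missing_packages(wanted)
if missing:
    # Print the install command rather than running it.
    print("conda install --yes " + " ".join(missing))
else:
    print("All requested packages are already installed.")
```

Run it with the Python from the environment you care about, because the result depends on which interpreter executes the script.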


On the initial download screen, you will see a Python 2.7 version and a Python 3 version. This choice sets your default version of Python, so download the one you will use most often. You will be able to switch versions easily later, but starting with your most frequently used version makes life easier. If you don't know which version of Python you need, just download the most recent one.

For Anaconda, you can leave most of the defaults as they are and click through until you reach the final Advanced Options page, which looks like this:

[Screenshot: Anaconda Advanced Options page]

It's recommended that you check both options, but if you don't want this Python version to be your default, just leave the second option unchecked.
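
Once the installer finishes, a quick sanity check shows which Python runs by default and whether the `conda` command can be found. This is a minimal sketch with a made-up helper name, `installation_report`; it assumes you open a new terminal first and run the script with plain `python`.

```python
import platform
import shutil
import sys


def installation_report():
    """Collect a few facts about the Python that runs by default."""
    return {
        "python_version": platform.python_version(),
        "interpreter_path": sys.executable,
        # None means no `conda` executable was found on PATH.
        "conda_path": shutil.which("conda"),
    }


for key, value in installation_report().items():
    print(f"{key}: {value}")
```

If the interpreter path points into your Anaconda or Miniconda folder, that installation is the default Python for the terminal you ran it from.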