Contributor Guide

🛈  Warning
This is a draft document.

Welcome to the Contributor Guide! Here you will find useful resources that will help you start contributing to the Scientific Python ecosystem.

First steps

Why contribute

Learn some of the reasons why contributing to open source Scientific Python is impactful and can be a transformative experience for developers!

Your time is the most valuable thing you have, so when taking part in volunteer activities it is always worth asking “why?”. Here are a few reasons:

Advance science

First, Scientific Python is about science. And if you believe that science makes the world better, then improving scientific tooling is extremely important! By putting better open source tools in the hands of researchers, we can help them produce accurate results in a transparent and reproducible way. We believe that scientific tools should be open, and that they should belong to those who use them.

Make an impact

Second, Scientific Python is about openness. When you take part in building open source software, your work may be used by thousands, sometimes millions of people. Your software may only be a tiny cog in a big machine, but it could help fly the next space mission, decipher the origins of the universe, or help invent radically new medical treatments. That is real impact!

Grow as a developer

Of course, it’s not just about science and researchers, but also about you: the volunteer contributor. And as we said before, we strongly believe that scientific tools should be developed and owned by those that use them. This is the best way to ensure that tools meet the needs of science.

But, even if you are not a scientist, you can contribute and benefit from contributing.

Being part of the open source community, you will work with some of the very best programmers in the world. Through their feedback, you will become a better developer and also learn how to be an excellent collaborator and team member. You’ll learn best practices of software development and engineering, and how to best present and communicate your ideas.

Last, but not least, you will likely work with and make friends with people from around the globe!

Conclusion

These are but a few of the reasons why we contribute to open source Scientific Python.

Shaping the tools you and others use has been a transformative experience for many of us, and we hope it will be for you too.

We cannot wait to welcome you to the Scientific Python community!

Ways to contribute

Learn some of the ways you can contribute to open source Scientific Python projects without having to code.

5 ways to contribute to Scientific Python without coding

Scientific Python is code designed by scientists and engineers for science and engineering. All projects have a straightforward license that determines what you can and cannot do; typically, you may use and modify the software, as long as you give credit to the original authors. The entire ecosystem relies on peer review and community production, so your contribution is really important. There are many ways to contribute outside of coding—we’ll discuss a few.

Issue testing and triaging

Every Scientific Python project has its own issue tracker where users report bugs, suggest UX improvements, and discuss technical problems they are having. This lets developers support users and track improvement to the projects. One way in which you can contribute is by verifying and triaging issues.

For example:

  1. You can check if older issues are still relevant and let developers know if they were solved.
  2. You can find duplicate issues and link related ones, since usually the same problems are reported multiple times.
  3. You can add self-contained code snippets that reproduce issues, thereby helping developers to find the problem and test solutions.
  4. You can label issues by adding appropriate tags. This usually requires triage rights, but you can just ask for them!
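
As a sketch of item 3, a self-contained reproduction script usually contains three things: the environment details, the smallest input that triggers the behaviour, and the expected versus actual result. The example below is hypothetical: it uses only the standard library and a made-up report about `round`, and the names `environment_report` and `reproduce` are illustrative rather than any project's convention.

```python
import platform
import sys


def environment_report():
    """Collect the version details maintainers usually ask for."""
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "executable": sys.executable,
    }


def reproduce():
    """Smallest input that shows the reported behaviour (hypothetical report)."""
    return round(2.5)


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
    print("expected: 3")
    print(f"actual:   {reproduce()}")
```

In this made-up case the script shows that `round(2.5)` returns 2. That is documented behaviour (Python rounds halves to the nearest even number), so a triager could answer the report with an explanation instead of leaving it open.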

Reviewing PRs

Pull requests (PRs) are the way in which Scientific Python projects incorporate new code. You can help, even if you’re not familiar with them, by:

  1. Summarizing discussions in PRs so that newcomers to the conversation can easily catch up without reading the whole thread.
  2. Testing proposed changes in PRs to make sure they function correctly and don’t break existing code.

Improving documentation

Another way to contribute to a project is by improving its documentation. Documentation is crucial for every Scientific Python project since that is the way users learn how to use it. This doesn’t mean you need to write new documentation (which you can do by following the docs contributing guide); there are other ways you can help too.

  1. Many projects have tutorials which you can review and report confusing or missing parts.
  2. Find typos and minor errors in docs and report them in the docs repository issue tracker.
  3. If you feel like creating your own content, you can write your own guides and tutorials. There are several categories of materials you can produce: how-tos, deep-dive explanations, gallery examples, notebooks, videos, etc.
  4. Apart from content, you can also improve docs organization and style.

Translations

Most Scientific Python projects are developed in English, but an increasing number use online platforms such as Crowdin to translate their interface, webpage, and documentation. If you speak a language other than English and feel comfortable translating, this is yet another way you can help.

Participating in the community

Every Scientific Python project has a community of volunteers that you can be part of. You can get involved in online conversations and discussions about the projects, offer help to newcomers, come to community meetings, or teach others about the project. You can even help with community outreach by sharing content on Twitter, organizing code sprints, posting newsletter updates, or writing blogs.

As you’ve seen, there are many ways to contribute to Scientific Python! No matter what you have to offer, go ahead and reach out to project maintainers: they will be happy to receive all the help they can get.

Choosing a project

Learn how to choose a project to start contributing to the Scientific Python Ecosystem.

[DRAFT] This video has not been recorded yet.

Contributing to a Scientific Python project

A common question from new contributors is: “How do I choose which project to contribute to?” Some people end up contributing to many different projects, while others tend to focus their effort on a single project. And while projects in the ecosystem have a lot in common, each has its own community, so there may be differences in culture, style, and decision-making processes. Ultimately, which projects you contribute to will depend a lot on your own personal interests and goals.

Understanding the project

Some projects like NumPy are used by many projects in the ecosystem. Such projects are mature and relatively full-featured. Given their central role in the ecosystem, working on these projects can have a huge impact. However, making changes to these projects may be more challenging than in newer and less central projects. It’s not uncommon even for core developers to have their pull requests go through iterations for months before being merged.

For example, because NumPy affects nearly the entire ecosystem, contributing larger features to it is difficult and usually requires a NumPy Enhancement Proposal (NEP) to be approved before work starts. Enhancement Proposals are fairly common for core projects in the ecosystem and consist of a write-up of the planned changes, including a summary of the implementation, its pros and cons, and sometimes a coded proof of concept. The proposal is then discussed and iterated on before a decision is made.

On the other hand, projects such as NetworkX may just require a review or two and basic tests before your changes are merged.

It’s worth remembering this distinction when deciding how much time you’d like to invest.

Contributing to the ecosystem

The open source Scientific Python community functions differently from a normal work environment because it is composed largely of people contributing in their free time, from different time zones. As such, contributors and maintainers may not always be able to get back to you immediately.

Since so many community members are volunteers, any and all contributions are highly valued. Maintainers always want to help, but they are often over-subscribed and may miss notifications or read something and forget to respond. If you haven’t heard back from them in a few days, it’s usually safe to give them a friendly ping to check.

Learn more about a project

Getting to know the developer community is a great way to learn more about the projects and find a great fit. There are many ways to begin interacting with project communities:

  • Most projects have a developer mailing list or discussion forum.
  • Some projects also have real-time chat for developers and newcomers, such as Slack or Discord.
  • Projects often have weekly or monthly online community meetings.
  • You can also watch a project’s development activity on GitHub.
  • Finally, you can join the Scientific Python discussion forum, where we bring together users and developers from multiple projects across the ecosystem.

Getting started

Learn the first steps to contribute to open source Scientific Python.

[DRAFT] This video has not been recorded yet.

Choosing a project to work on

How should you choose which project to work on? There are many projects in the ecosystem to choose from, so it’s important to find one that relates to something you’re interested in or that you already use. For example, if you’re interested in working with images, it might be worth looking into implementing algorithms in scikit-image.

Typically, it is easier to contribute to smaller projects, but you also want to choose a project that’s active enough that the developers can review your code and provide mentorship. An active project also tends to have more open issues and ideas to work on.

Before diving into a project, take a look at its open issues and pull requests, see how maintainers interact with the community, and decide if it would be a good fit for you.

For a more detailed discussion, also take a look at our Choosing a project video, linked below.

Tools to learn

As with any trade, there are certain fundamental tools you should learn. Since the ecosystem is built in Python, you’ll need to know how to program in that language. Other tools we use daily include:

  • Git and GitHub
  • the command-line terminal, and
  • a good editor.

Take a look below for links on how to learn these tools.

First contribution

Now that you’ve chosen a project to contribute to, it’s time to get set up. Most projects have a file called CONTRIBUTING in the root of the repository that will tell you how to set up your development environment, propose changes, etc. Developer documentation will also explain testing and review procedures, and whatever else you need to know.

When first contributing to a project, it’s best to start with small, self-contained issues. Often, maintainers will label issues with the “good first issue” label, so take a look at those first. Examples of a good first issue include fixing a small bug, adding tests, fixing documentation typos, or writing up simple documentation.

It is not uncommon to get stuck while making your first contribution. Don’t panic! Try to find the real-time chat or discussion forum for the project, and ask for assistance there. The maintainers will be happy to help!

For more details, also check out our First contribution video.

What’s next

Once you’re comfortable making small changes to the project, you can start taking on bigger features. There are many different ways to help: you may, e.g., implement new features, write documentation, refactor and clean up code, improve testing, work on build infrastructure, and so forth.

No matter what you contribute, or whether your contribution is big or small, it is much appreciated.

First contribution

Start working on your first contribution to open source Scientific Python.

[DRAFT] This video has not been recorded yet.

How to make your first contribution to open source?

Before you start, make sure you have the following:

  • A GitHub account
  • A terminal or command line
  • An editor or IDE
  • Git installed on your computer
  • Conda installed on your computer

There are some links below the video to help you get these elements ready in case you are missing some.

Now, we can get started.

Step 1: Fork the project’s repo

Go to the project’s repository and click the “Fork” button at the top right of the page. This will create a copy of the repository in your own account.

Step 2: Clone your fork

On your new fork, click the green “Code” button and copy the link that appears there to get the URL for cloning it.

Now, open your terminal (or Git Bash, if you’ve installed Git for Windows) and type the command ‘git clone’ followed by pasting the URL you just copied. With this, you now have a local copy of your fork.

Finally, change to the directory of the repo you just cloned and add the project’s repo as the “upstream” remote repository by typing the following:

git remote add upstream https://github.com/.git

Step 3: Set up your development environment

Most open source projects have their own contributing guide, which explains the steps needed for setting up your development environment. You’ll usually find it in the root directory of the repo. We recommend that you create a new environment for this.

To create and activate a new Conda environment, type the following commands in your terminal (or Anaconda Prompt on Windows):

conda create -n [NAME] python=3
conda activate [NAME]

After you have created your new Conda environment, you need to install the project’s necessary dependencies (This depends on which project we will be using for this video):

conda install …

Step 4: Pick an issue

Now we need to select the issue we want to fix from the repository’s issue tracker (Add link of Project’s issue tracker to display in video here) and reproduce it in the development version of our project. (Not sure this applies, again it depends on the project).

Step 5: Create a new branch for your changes

First create a branch for your work. Run the following command in your command line:

git checkout -b [BRANCH NAME]

Step 6: Find the file and make the changes

Open the file you need to change in your editor or IDE, make the changes that solve the issue, and save them.

Step 7: Confirm/test that the issue is solved in dev mode

(Not sure this applies)
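
One common way to confirm a fix, where the project has a test suite, is to add a small regression test that fails before the change and passes after it. The sketch below is hypothetical: `clip_to_range` stands in for whatever function the issue is about, and the test is written in the plain `assert` style that test runners such as pytest collect. Check the project's contributing guide for its actual testing conventions.

```python
def clip_to_range(value, low, high):
    """Return value limited to the interval [low, high] (hypothetical function under repair)."""
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(value, high))


def test_clip_to_range_handles_bounds():
    """Regression test: values outside the interval are pulled back to its edges."""
    assert clip_to_range(5, 0, 10) == 5
    assert clip_to_range(-3, 0, 10) == 0
    assert clip_to_range(42, 0, 10) == 10


if __name__ == "__main__":
    test_clip_to_range_handles_bounds()
    print("regression test passed")
```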

Step 8: Commit your changes

Now, you are ready to add and commit your changes with a descriptive message. Type the following command in your terminal:

git commit -a -m "descriptive message"

Finally, push your new branch with your changes to your fork on GitHub:

git push -u origin [BRANCH NAME]

Enter your GitHub username and a personal access token if requested; GitHub no longer accepts account passwords for Git operations over HTTPS.
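
The commit and push commands from Step 8 can also be scripted. The following is a minimal sketch, not part of any project's tooling: `commit_and_push_commands` and `run_all` are hypothetical helpers, the branch name and commit message are placeholders, and by default nothing is executed; the commands are only printed.

```python
import shlex
import subprocess


def commit_and_push_commands(branch, message):
    """Build the two Git commands from Step 8 as argument lists."""
    return [
        ["git", "commit", "-a", "-m", message],
        ["git", "push", "-u", "origin", branch],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(shlex.join(command))
        if not dry_run:
            # check=True stops at the first command that fails.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Placeholder values; replace them with your own branch and message.
    run_all(commit_and_push_commands("fix-docs-typo", "Fix typo in README"))
```

Passing the arguments as a list sidesteps shell quoting, so the commit message can contain spaces or quotation marks without extra escaping.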

Step 9: Open PR

Now, you can submit your changes to the project’s repo.

Go to the project’s repository on GitHub, and you will see the option to open a Pull Request. You also have to make sure that you select the correct branch to merge your changes into.

You have now made your first contribution to open source!

Getting set up

Ecosystem

Learn how the Scientific Python ecosystem is composed and some of its main packages.

[DRAFT] This video has not been recorded yet.

The Scientific Python ecosystem is a collection of open-source scientific software packages written in Python. It is a broad and ever-expanding set of algorithms and data structures that grew around NumPy, SciPy, and Matplotlib.

The ecosystem includes a wide variety of tools: some more specialized to specific domains such as biological imaging or astronomy, and others quite general for tasks such as data management and high-performance computing.

It includes projects such as pandas (for data analysis), NetworkX (for graph computation), scikit-learn (for machine learning), and scikit-image (for image processing).

Ecosystem Packages

Here is a curated selection of packages available in the ecosystem:

Core
  • NumPy, the fundamental package for numerical computation. NumPy defines the n-dimensional array data structure, the most common way of exchanging data within packages in the ecosystem.
  • SciPy, a collection of numerical algorithms and domain-specific toolboxes, including signal processing, optimization, statistics, and much more.
  • Matplotlib, a mature and popular plotting package that provides flexible, publication-quality 2-D and 3-D visualization.
Data and computation
  • pandas, providing high-performance, easy-to-use data structures.
  • SymPy, for symbolic mathematics and computer algebra.
  • NetworkX is a collection of tools for analyzing complex networks.
  • scikit-image is a collection of algorithms for image processing.
  • scikit-learn is a collection of algorithms and tools for machine learning.
Productivity and high-performance computing
  • IPython, a command-line interface to Python, for interactively exploring code, processing data, and testing code ideas.
  • JupyterLab provides computational notebooks that combine interactive code with descriptive text in your web browser, useful especially for teaching and documenting research.
  • Joblib, Dask, or Ray for distributed processing with a focus on numerical data.

Install

Learn how to install the tools you need to start contributing to the Scientific Python ecosystem.

[DRAFT] This video has not been recorded yet.

Before installing Scientific Python libraries, you need to have Python itself installed. There are two, largely equivalent, ways of doing that, and we describe both below.

If you have a working version of Python on your system already (check by running python3), you can skip to setting up a virtual environment.

Segment 1: Python.org

This is the official Python distribution, which uses the pip package manager. pip installs packages from the Python Package Index, or PyPI for short.

Download the installer from https://www.python.org/downloads/.

Set up a virtual environment

A virtual environment is a workspace into which you can install Python libraries, separate from what is being used by your operating system.

Create a new virtual environment in a directory called py3:

python -m venv py3

Start using it as follows:

source py3/bin/activate
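
To check that activation worked, you can ask Python which installation it is running from. This is a minimal sketch using only the standard library; `in_virtual_environment` is a hypothetical helper name, and the check applies to environments created with `venv` (conda-style environments are detected differently).

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtual environment active:", in_virtual_environment())
```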

Also, make sure you have pip installed—that is Python’s default package manager:

python -m ensurepip

You are now ready to install Scientific Python packages using pip! For example:

pip install ipython numpy scipy

You should now be able to run IPython (the interactive Python shell) to try out NumPy:

$ ipython

In [1]: import numpy as np

In [2]: np.linspace(0, 10, 5)
Out[2]: array([ 0. ,  2.5,  5. ,  7.5, 10. ])
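
If you want to confirm which packages ended up in the environment, the standard library can report installed versions. The sketch below assumes Python 3.8 or newer, where `importlib.metadata` is available; `installed_versions` is a hypothetical helper, not part of pip.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(["ipython", "numpy", "scipy"]).items():
        print(f"{name}: {version or 'not installed'}")
```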

Segment 2: Mambaforge

Mambaforge is a small Python distribution based around the mamba package manager, and installs packages from the community repository conda-forge.

Mamba is a bit different from Python’s pip package manager in that, in addition to Python libraries, it can also install compilers, non-Python libraries, and so forth.

Download the latest version from GitHub. Run the installer, and when it asks you “Do you wish the installer to initialize Mambaforge?” enter “yes”.

Set up a virtual environment

A virtual environment is a workspace into which you can install Python libraries, separate from what is being used by your operating system.

Create a new virtual environment in a directory called py39:

mamba create -p py39

Mamba uses conda to switch between virtual environments. Start using the new environment as follows:

conda activate ./py39
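
To check from Python that a conda-style environment is active, one option is to read the `CONDA_PREFIX` environment variable, which conda sets on activation. This is a minimal sketch; `active_conda_environment` is a hypothetical helper, and the `environ` parameter exists only so the function can be tried with a plain dictionary.

```python
import os


def active_conda_environment(environ=None):
    """Return the path of the active conda environment, or None if there is none."""
    environ = os.environ if environ is None else environ
    # conda sets CONDA_PREFIX when an environment is activated.
    return environ.get("CONDA_PREFIX")


if __name__ == "__main__":
    print("active environment:", active_conda_environment() or "none")
```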

You are now ready to install Scientific Python packages using mamba! For example:

mamba install ipython numpy scipy

You should now be able to run IPython (the interactive Python shell) to try out NumPy:

$ ipython

In [1]: import numpy as np

In [2]: np.linspace(0, 10, 5)
Out[2]: array([ 0. ,  2.5,  5. ,  7.5, 10. ])

Next Steps

Start exploring some of the packages from the Scientific Python ecosystem.

[DRAFT] This video has not been recorded yet.

Scientific Python is built on the Python programming language. Using Scientific Python therefore requires having a firm grasp of Python itself. We suggest reading through the official tutorial, doing an online tutorial on exercism, or using any of the countless resources that exist online or in print.

Learning a new language can be challenging, but Python is fun—so keep trying and hang in there! The community is there to help you along the way.

So let’s cover some basics.

How to run your Python code

Python is an interpreted language: that means that it reads a text file with instructions and executes those one by one.

The easiest way to create a text file is in a text editor, like Spyder or VSCode. We can do that right now. Let’s create a file called hello.py:

print("Hello world")

And then run it:

python hello.py
Hello world

That’s it, your first Python program!

You can also play around with Python code interactively in IPython:

[launch IPython and run:]

In [1]: def fibonacci(n):
   ...:     a, b = 0, 1
   ...:     for i in range(0, n):
   ...:         a, b = b, a + b
   ...:     return a
   ...:

In [2]: fibonacci(10)
Out[2]: 55

Another way to play with Python code is in JupyterLab. This is an interactive web application for typing in and executing Python code. Let me show you how to do a simple plot in Jupyter:

[Open JupyterLab; create notebook; import matplotlib.pyplot as plt; plt.plot([1, 2, 3])]

You can head over to https://try.jupyter.org to test it out.

Hello NumPy

What distinguishes most scientific codes from general ones is that they operate on collections of numbers. These are often represented as NumPy arrays—they are fast, and they have convenient syntax.

Let’s generate 1000 random numbers and square them:

[In IPython]

import numpy as np
import matplotlib.pyplot as plt

# Generate 1000 random numbers, store in x
x = np.random.random(size=1000)

# Square them and store in y
y = x**2

# Plot the results as points; x is unsorted, so connecting lines would zigzag
plt.plot(x, y, ".")
plt.show()

Learn more!

We’ll post a list of links below the video where you can learn more.

By far the best way to learn, however, is to start coding!

Stuck?

The first thing to do when stuck is to read the documentation. Note that almost all libraries ship with documentation right at your fingertips!

[illustrate how to look up the docstring for np.linspace]
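
As a rough illustration of looking up documentation from within Python: `help(obj)` prints the full docstring, and in IPython you can type `np.linspace?`. The sketch below uses only the standard library so that it runs without NumPy; `first_doc_line` is a hypothetical helper, and with NumPy installed you could pass it `np.linspace` in the same way.

```python
import inspect


def first_doc_line(obj):
    """Return the first line of an object's docstring, or an empty string."""
    doc = inspect.getdoc(obj)
    return doc.splitlines()[0] if doc else ""


if __name__ == "__main__":
    # Built-in example; with NumPy installed, first_doc_line(np.linspace) works the same way.
    print(first_doc_line(len))
    # help(len) would print the complete documentation in the terminal.
```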

If you are still stuck, join the community forum at https://discuss.scientific-python.org or reach out to the relevant package on its mailing list.

Good luck!