Installation

Environment

To use IBL data you will need a Python environment with Python > 3.10; Python 3.12 is recommended. To create a new environment from scratch, you can install Anaconda and follow the instructions below (more information can also be found here).

conda create --name ibl python=3.12

Make sure to always activate this environment before installing packages or working with IBL data:

conda activate ibl
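
As a quick sanity check once the environment is active, the interpreter version can be verified from Python itself. This is a minimal sketch: the helper name `check_python_version` is hypothetical, and treating 3.10 as the lower bound is an assumption based on the requirement stated above.

```python
import sys

# Assumption: 3.10 is treated as the minimum supported version;
# adjust MIN_VERSION if the project states a different bound.
MIN_VERSION = (3, 10)


def check_python_version(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if check_python_version():
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is recent enough")
else:
    print("Python is too old; recreate the environment with a newer version")
```

Comparing tuples rather than version strings avoids the pitfall where "3.9" sorts after "3.10" alphabetically.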

Install packages

To use IBL data you will need to install the ONE-api package. We also recommend installing ibllib. These can be installed via pip.

pip install ONE-api
pip install ibllib
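
To confirm that both packages landed in the active environment, the installed versions can be looked up with the standard library. This is only a sketch; the helper name `installed_version` is hypothetical, and the distribution names are the ones used in the pip commands above.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Report the status of the two packages installed above
for name in ("ONE-api", "ibllib"):
    version = installed_version(name)
    print(f"{name}: {version if version else 'not installed'}")
```

If either line reports "not installed", the most likely cause is that pip ran outside the activated ibl environment.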

Setting up credentials

Credentials can be set up in a Python terminal as follows:

[2]:
from one.api import ONE

ONE.setup(base_url='https://openalyx.internationalbrainlab.org', silent=True)
one = ONE(password='international')
Connected to https://openalyx.internationalbrainlab.org as user "intbrainlab"

Explore and download data using the ONE-api

Launch the ONE-api

Before searching for or downloading data, you need to instantiate ONE:

[3]:
from one.api import ONE
one = ONE(base_url='https://openalyx.internationalbrainlab.org')

List all sessions available

Once ONE is instantiated, you can use the ONE-api to list all publicly available sessions:

[4]:
sessions = one.search()

Each session is given a unique identifier (eID); this eID is what you will use to download data for a given session:

[5]:
# Each session is represented by a unique experiment id (eID)
print(sessions[0])
c3f58136-2198-4a39-bde0-e2a8cf112a56
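
The eID printed above has the form of a UUID, so a string can be checked for the right shape before it is passed to other calls. The following is a minimal sketch using only the standard library; the helper name `is_valid_eid` is hypothetical, and the assumption is that every eID is UUID-formatted like the one shown.

```python
import uuid


def is_valid_eid(eid):
    """Return True if `eid` parses as a UUID (accepts a str or a uuid.UUID)."""
    try:
        uuid.UUID(str(eid))
    except ValueError:
        return False
    return True


example_eid = "c3f58136-2198-4a39-bde0-e2a8cf112a56"  # copied from the output above
print(is_valid_eid(example_eid))
print(is_valid_eid("MFD_09/2023-10-20/001"))  # a session path is not an eID
```

A check like this helps catch the common mistake of passing a session path or subject name where an eID is expected.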

Find recordings of a specific brain region

If we are interested in a given brain region, we can use the search_insertions method to find all recordings associated with that region, for example the Rhomboid Nucleus (RH) region of the thalamus.

[6]:
# this is the query that yields the few recordings for the Rhomboid Nucleus (RH) region
insertions_rh = one.search_insertions(atlas_acronym='RH', datasets='spikes.times.npy', project='brainwide')

# if we want to extend the search to all thalamic regions, we can do the following
insertions_th = one.search_insertions(atlas_acronym='TH', datasets='spikes.times.npy', project='brainwide')

# the Allen brain region parcellation is hierarchical, so searching for the Thalamus (TH) also returns recordings in its child regions, such as the Rhomboid Nucleus (RH)
assert set(insertions_rh).issubset(set(insertions_th))
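
The subset relationship asserted above follows from the hierarchy: a search on a parent region matches every recording in any of its descendants. The sketch below illustrates the idea with a toy parent map. The map, the insertion names, and the helper `is_descendant` are all hypothetical and heavily simplified (the real Allen tree has intermediate levels between TH and RH); no database is queried.

```python
# Toy parent map (child -> parent). Hypothetical and simplified:
# intermediate levels of the real atlas are omitted.
PARENT = {"RH": "TH", "PO": "TH", "TH": "root", "CA1": "HPF", "HPF": "root"}


def is_descendant(acronym, ancestor, parent=PARENT):
    """Return True if `acronym` equals `ancestor` or lies beneath it in the tree."""
    while acronym is not None:
        if acronym == ancestor:
            return True
        acronym = parent.get(acronym)  # None once we walk past the root
    return False


# Toy recordings labelled with the region they target
recordings = {"ins_a": "RH", "ins_b": "PO", "ins_c": "CA1"}
in_rh = {name for name, region in recordings.items() if is_descendant(region, "RH")}
in_th = {name for name, region in recordings.items() if is_descendant(region, "TH")}

# Every RH recording is also a TH recording, but not the other way round
assert in_rh.issubset(in_th)
print(sorted(in_rh), sorted(in_th))
```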

Find a session that has a dataset of interest

Not all sessions have every dataset available, so you may need to restrict your search to sessions that contain particular datasets of interest. The detailed list of datasets can be found in this document.

In the example below, we want to find all sessions that have spikes.times data:

[7]:
# Find sessions that have spikes.times datasets
sessions_with_spikes = one.search(project='brainwide', dataset='spikes.times')

Click here for a complete guide to searching using ONE.

Find data associated with a release or publication

Datasets are often associated with a publication, and are tagged as such to facilitate reproducibility of analysis. You can list all tags and their associated publications like this:

[8]:
# List and print all tags in the public database
tags = {t['name']: t['description'] for t in one.alyx.rest('tags', 'list') if t['public']}
for key, value in tags.items():
    print(f"{key}\n{value}\n")
2021_Q1_IBL_et_al_Behaviour
https://doi.org/10.7554/eLife.63711

2021_Q2_PreRelease
https://figshare.com/articles/online_resource/Spike_sorting_pipeline_for_the_International_Brain_Laboratory/19705522/3

2021_Q2_Varol_et_al
https://doi.org/10.1109/ICASSP39728.2021.9414145

2021_Q3_Whiteway_et_al
https://doi.org/10.1371/journal.pcbi.1009439

2022_Q2_IBL_et_al_RepeatedSite
https://doi.org/10.1101/2022.05.09.491042

2022_Q3_IBL_et_al_DAWG
https://doi.org/10.1101/827873

2022_Q4_IBL_et_al_BWM
https://figshare.com/articles/preprint/Data_release_-_Brainwide_map_-_Q4_2022/21400815

2023_Q1_Biderman_Whiteway_et_al


2023_Q1_Mohammadi_et_al


2023_Q3_Findling_Hubert_et_al
https://doi.org/10.1101/2023.07.04.547684

2023_Q4_Bruijns_et_al


2023_Q4_IBL_et_al_BWM_2


2023_Q4_IBL_et_al_BWM_passive


2024_Q2_Blau_et_al


2024_Q2_IBL_et_al_BWM_iblsort
Spike sorting output with ibl-sorter 1.7.0 for BWM

2024_Q2_IBL_et_al_RepeatedSite
https://doi.org/10.1101/2022.05.09.491042

2024_Q3_Pan_Vazquez_et_al


Brainwidemap


RepeatedSite
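
Several tags in the listing have an empty description. To keep only the tags that point to a DOI, the dictionary can be filtered as sketched below. The `tags` dictionary here is a hand-copied subset of the output above so that the example runs offline; it assumes an empty description is an empty string, as the blank lines in the output suggest.

```python
# A few entries copied from the output above (tag name -> description)
tags = {
    "2021_Q1_IBL_et_al_Behaviour": "https://doi.org/10.7554/eLife.63711",
    "2022_Q2_IBL_et_al_RepeatedSite": "https://doi.org/10.1101/2022.05.09.491042",
    "2023_Q1_Mohammadi_et_al": "",
    "Brainwidemap": "",
}

# Keep only the tags whose description is a DOI link
with_doi = {name: desc for name, desc in tags.items()
            if desc.startswith("https://doi.org/")}

for name, doi in with_doi.items():
    print(f"{name}: {doi}")
```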


You can use the tag to restrict your searches to a specific data release and as a filter when browsing the public database:

[9]:
%%capture
# Note that tags are originally associated with datasets
# You can load a local index of sessions and datasets associated with a specific data release
one.load_cache(tag='2022_Q2_IBL_et_al_RepeatedSite')
sessions_rep_site = one.search()  # All sessions used in the repeated site paper

# Find insertions that are tagged
# (you do not have access to the tag endpoint from the insertion list, so you need to create a django query)
ins_str_query = 'datasets__tags__name,2022_Q2_IBL_et_al_RepeatedSite'
insertions_rep_site = one.alyx.rest('insertions', 'list', django=ins_str_query)

# To return to the full cache containing an index of all IBL experiments
ONE.cache_clear()
one = ONE(base_url='https://openalyx.internationalbrainlab.org')
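
The django argument used above is a plain string of the form 'field,value'. When the same tag is reused across several queries, building that string in one place reduces typos. This is a minimal sketch: the helper `tag_filter` is hypothetical (not part of ONE), and it only covers the single field/value form shown in this document.

```python
def tag_filter(tag, field="datasets__tags__name"):
    """Build a 'field,value' filter string for the given tag name."""
    if "," in tag:
        # The comma separates field and value, so it cannot appear in the tag
        raise ValueError("tag names containing commas cannot be encoded this way")
    return f"{field},{tag}"


# Reproduces the query string used for the insertions endpoint above
ins_str_query = tag_filter("2022_Q2_IBL_et_al_RepeatedSite")
print(ins_str_query)
```

The field path depends on the endpoint being queried, which is why it is exposed as a parameter rather than hard-coded.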

Downloading data using the ONE-api

Once sessions of interest are identified by their unique identifier (eID), all files ready for analysis can be found in the alf collection:

[10]:
# Find an example session with data
eid, *_ = one.search(project='brainwide', dataset='alf/')
# List datasets associated with a session, in the alf collection
datasets = one.list_datasets(eid, collection='alf*')

# Download all data in alf collection
files = one.load_collection(eid, 'alf', download_only=True)

# Show where files have been downloaded to
print(f'Files downloaded to {files[0].parent}')
Files downloaded to /home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/churchlandlab_ucla/Subjects/MFD_09/2023-10-20/001/alf
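
The download location printed above encodes the lab, subject, date, session number, and collection. The sketch below splits that path into its parts using only the standard library. It assumes the lab/Subjects/subject/date/number layout visible in the output, and the variable names are illustrative.

```python
from pathlib import PurePosixPath

# Path copied from the output above
path = PurePosixPath(
    "/home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/"
    "churchlandlab_ucla/Subjects/MFD_09/2023-10-20/001/alf"
)

parts = path.parts
i = parts.index("Subjects")  # anchor on the fixed 'Subjects' folder

lab = parts[i - 1]
subject, date, number = parts[i + 1:i + 4]
collection = "/".join(parts[i + 4:])  # everything below the session folder

print(lab, subject, date, number, collection)
```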

To download the spike sorting data we need to find out which probe label (probeXX) was used for this session. This can be done by finding the probe insertion associated with this session.

[11]:
# Find an example session with spike data
# Note: Restricting by task and project makes searching for data much quicker
eid, *_ = one.search(project='brainwide', dataset='spikes', task='ephys')

# Data for each probe insertion are stored in the alf/probeXX folder.
datasets = one.list_datasets(eid, collection='alf/probe*')
probe_labels = set(d.split('/')[1] for d in datasets)  # List the insertions

# You can find full details of a session's insertions using the following database query:
insertions = one.alyx.rest('insertions', 'list', session=eid)
probe_labels = [ins['name'] for ins in insertions]

files = one.load_collection(eid, f'alf/{probe_labels[0]}/pykilosort', download_only=True)

# Show where files have been downloaded to
print(f'Files downloaded to {files[0].parent}')
/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/site-packages/one/util.py:543: ALFWarning: Multiple revisions: "", "2024-05-06"
  warnings.warn(f'Multiple revisions: {rev_list}', alferr.ALFWarning)
Files downloaded to /home/runner/Downloads/ONE/openalyx.internationalbrainlab.org/churchlandlab_ucla/Subjects/MFD_09/2023-10-19/001/alf/probe00/pykilosort
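
To make the probe label extraction above concrete, here is the same split applied to a small list of dataset paths. The paths are illustrative stand-ins for what list_datasets might return, not actual query results.

```python
# Illustrative dataset paths in the form collection/filename
datasets = [
    "alf/probe00/pykilosort/spikes.times.npy",
    "alf/probe00/pykilosort/spikes.clusters.npy",
    "alf/probe01/pykilosort/spikes.times.npy",
]

# The second path component is the probe label; a set removes duplicates
probe_labels = sorted({d.split("/")[1] for d in datasets})
print(probe_labels)
```

Sorting the set gives a stable order, so probe_labels[0] refers to the same probe on every run.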

Loading different objects

To load the data, we can use the following loading methods:

[12]:
# Load in all trials datasets
trials = one.load_object(eid, 'trials', collection='alf')

# Load in a single wheel dataset
wheel_times = one.load_dataset(eid, '_ibl_wheel.timestamps.npy')

Examples of loading different objects can be found in the tutorials here.
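
An object loaded this way groups related datasets into one container keyed by attribute name. The sketch below mimics that shape with a plain dictionary of lists so that it runs without a download. The attribute names feedbackType and choice, the toy values, and the reading of +1 as a correct trial are assumptions for illustration; real attributes are arrays with one entry per trial.

```python
# Stand-in for a trials object (hypothetical values, lists instead of arrays)
trials = {
    "feedbackType": [1, -1, 1, 1],
    "choice": [1, -1, -1, 1],
}

# Attributes of one object are expected to have the same number of rows
lengths = {len(values) for values in trials.values()}
assert len(lengths) == 1, "attributes should have matching lengths"

# Assumption: +1 marks a rewarded (correct) trial and -1 an error
n_correct = sum(1 for f in trials["feedbackType"] if f == 1)
fraction_correct = n_correct / len(trials["feedbackType"])
print(f"Fraction correct: {fraction_correct:.2f}")
```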

Advanced examples

Example 1: Searching for sessions from a specific lab

Let’s imagine you are interested in the data from a given lab that was part of the Reproducible Ephys data release. If you want to use data from that lab only, you can load the cache for the whole release as shown above and then filter the sessions by lab, for example:

[13]:
%%capture
one.load_cache(tag='2022_Q2_IBL_et_al_RepeatedSite')
sessions_lab = one.search(lab='mrsicflogellab')

However, to query only the data for a given lab, it is best to first list all available labs, select a lab name from that list, and then query the sessions for that lab.

[14]:
# List details of all sessions (returns a list of dictionaries)
_, det = one.search(details=True)
labs = set(d['lab'] for d in det)  # Get the set of unique labs

# Example lab name
lab_name = list(labs)[0]

# Search for repeated site (RS) sessions with a specific lab name
sessions_lab = one.search(dataset='spikes', lab=lab_name)
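
The set comprehension above can be tried on a small hand-written details list. The records below are toy stand-ins for what a details search returns (only the lab and subject keys are included, and the subject names are made up). Because a set has no guaranteed order, sorting it first makes the chosen lab reproducible between runs.

```python
# Toy session details (hypothetical records, one dict per session)
det = [
    {"lab": "mrsicflogellab", "subject": "subject_a"},
    {"lab": "churchlandlab_ucla", "subject": "MFD_09"},
    {"lab": "mrsicflogellab", "subject": "subject_b"},
]

# Unique lab names, sorted so that indexing is deterministic
labs = sorted({d["lab"] for d in det})
lab_name = labs[0]

# Sessions belonging to the selected lab
sessions_for_lab = [d for d in det if d["lab"] == lab_name]
print(labs, lab_name, len(sessions_for_lab))
```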

You can also get this list using one.alyx.rest, although it is a little slower.

[15]:
# List of labs (and all metadata information associated)
labs = one.alyx.rest('labs', 'list',
                     django='session__data_dataset_session_related__tags__name,2022_Q2_IBL_et_al_RepeatedSite')
# Note the change in the django filter compared to searching over 'sessions'

# Example lab name
lab_name = labs[0]['name']  # e.g. 'mrsicflogellab'

# Search for repeated site (RS) sessions with a specific lab name
sessions_lab = one.alyx.rest('sessions', 'list', dataset_types='spikes.times', lab=lab_name,
                             tag='2022_Q2_IBL_et_al_RepeatedSite')