Fragalysis User Guide

Fragalysis is a web-based platform for the visualisation, comparison, and analysis of fragment-bound protein crystal structures, assay measurements, and follow-up virtual ligand screens. It can effectively be divided into:

Experimental fragment screening data processed via XChem Align and uploaded to Fragalysis, curated and downloaded via the “left-hand side” (LHS) of Fragalysis.

Computed follow-up designs from virtual compound sets uploaded to Fragalysis, curated and downloaded via the “right-hand side” (RHS) of Fragalysis.


Getting started

Fragalysis can be used to explore data in a number of ways:

Experimental Structures (LHS)

Computed Structures (RHS)

Jupyter Notebooks

Programming Interface (API)


The Fragalysis “viewer” interface

The Fragalysis viewer has been customised for fragment screening workflows, is fully interactive and runs directly in your browser. When opening a target, you will be presented with an interface that allows you to interact with and curate data:

lhs
  • Share/snapshot this allows you to create and share a permanent link to your exact Fragalysis state

  • Tags This is how you can control which hits are visible by sites and other categories

  • LHS / Hits Here you can navigate all the hits and add visualisations to them (The Tags panel also belongs to the LHS)

  • The visualisation buttons are shared also with virtual hits (RHS) and work as follows:

    • All : show ligand in (CPK), protein side chains (lines), and interactions.

    • Ligand: Ligand (CPK)

    • Protein: Protein side chains (lines)

    • Interactions: Interactions

    • Surface: Electrostatic surface of the protein

    • Electron Density: Experimental electron density

    • Vectors: Possible vectors for elaboration

Controlling the 3D viewer

Fragalysis uses NGL viewer under the hood to visualise 3D models, inspect binding sites, and compare multiple structures at once. It can be easily controlled with mouse and keyboard inputs:

Key / Mouse action

Effect

scroll

Zoom scene

scroll + Ctrl

Move near clipping plane

scroll + Shift

Move near clipping plane and far fog

scroll + Alt

Change isolevel of isosurfaces

drag right

Pan / translate scene

drag middle

Zoom scene

drag left

Rotate scene

drag + Shift + right

Zoom scene

drag left + right

Zoom scene

drag + Ctrl + right

Pan / translate hovered component

drag + Ctrl + left

Rotate hovered component

click pick (middle)

Auto view picked component element

hover pick

Show tooltip for hovered component element

i

Toggle stage spinning

k

Toggle stage rocking

p

Pause all stage animations

r

Reset stage auto view

Geometric filtering

Geometric filtering allows you to limit hits based on their position in 3D space. When you click any structure in the NGL Viewer, a green semi-transparent sphere will appear. After clicking Apply in the Radius selection dialog, only hits that intersect with the sphere will be shown in the hit navigator.

lhs

This feature is ON by default and can be toggled in the Advanced Search dialog. If you turn it off, any existing geometric filtering is cleared and no spatial filtering will be applied.

lhs

Browsing experimental data (LHS)

The left-hand-side (LHS) user interface of Fragalysis allows you to select experimental data for display, download, and computation.

There are three panels Tag Details, Hit Navigator, and Snapshot:

Tag Details

Tags are user-defined labels used to organise, filter, and annotate structures and fragments within Fragalysis. They provide a flexible way to group related data and share interpretation without modifying the underlying experimental data. This section details how tags can be used to filter experimental data. To add and edit tags see the curating experimental (LHS) data page.

All tags assigned to left-hand side data can be managed in this panel:

lhs

Select tags to show datasets assigned to that tag in the Hit navigator. The union / intersection toggle at the top of the page can be used to determine the behaviour when multiple tags are selected:

  • Union: Display datasets that are tagged with at least one of the selected tags

  • Intersection: Display datasets that are tagged with all of the selected tags

Control

Description

SHOW UNTAGGED HITS

Displays only datasets that do not have any tags assigned. Useful for identifying new or unreviewed fragment hits.

SHOW ALL HITS

Displays all datasets, ignoring the current tag selection. Overrides any active tag filters.

SELECT ALL TAGS

Selects all available tags in the tag list. Does not automatically select hits in the hit navigator.

SELECT HITS (per tag)

Activates selection checkboxes for all datasets associated with the chosen tag. Useful for bulk operations on tagged datasets.

Hit navigator

The Hit Navigator is the primary panel in Fragalysis for browsing, filtering, and selecting fragment screening hits associated with a target. It controls which datasets are loaded into the structure viewer and provides tools for organising hits using tags and metadata.

lhs

LHS “poses” and “observations”

Ligands from experimental datasets are known as observations are grouped into poses. Poses are named after and take their behaviour from their main observation.

Poses and observations have the following interface:

lhs

Display data in the 3D viewer

lhs

The [A][L][P][C][S][D][V] buttons can be used to activate various representations of the observation in the 3D NGL viewer:

  • [A]: shortcut to activate [L][P][C]

  • [L]: Ligand (ball and stick)

  • [P]: Protein sidechains (lines)

  • [C]: InteraCtions

  • [S]: Surface (coloured by electrostatics)

  • [D]: Electron Density maps

  • [V]: Expansion Vectors

What decides which hits are shown?

The hit navigator follows these key rules to determine which datasets are shown:

  1. All datasets displayed in the 3d viewer are shown

  2. Datasets from active tags are shown

  3. If a search string is present, only observations matching that string are shown

Searching for specific hits

The search bar can be used to search for specific hits:

lhs

Your search query will be matched against the following fields:

  • Observation shortcode

  • Compound aliases

  • Compound ID

This can be customised by clicking on the magnifying glass icon.

See also Creating direct URLs to specific views

Snapshots

Snapshots are saved views of the current analysis state, allowing you to quickly return to a specific set of selected hits and visualisation settings, and making it easy to share or revisit a particular analysis.

snapshot window

Creating direct URLs to specific views

To link to specific datasets within a target, the following syntax is supported:

Specifying the target and proposal

The following URL takes you to the target with:

  • name: A71EV2A

  • target access string (tas): lb32627-66:

https://fragalysis.diamond.ac.uk/viewer/react/preview/target/A71EV2A/tas/lb32627-66

Using the direct URL syntax

You can also create URLs that display specific datasets. To use this functionality you have to use this base URL including the direct command:

https://fragalysis.diamond.ac.uk/viewer/react/preview/direct/

Examples

e.g. showing observations with ligands where compound alias contains substring ASAP:

https://fragalysis.diamond.ac.uk/viewer/react/preview/direct/target/A71EV2A/tas/lb32627-66/compound/ASAP/L

  • target/A71EV2A: specifies the target name

  • tas/lb32627-66: specifies the target access string

  • compound/ASAP/L: shows the ligand (L) representation for all compound aliases containing ASAP

e.g. showing observations with ligands where compound alias is exactly ASAP-0016733-001:

https://fragalysis.diamond.ac.uk/viewer/react/preview/direct/target/A71EV2A/tas/lb32627-66/compound/ASAP-0016733-001/L/exact

  • target/A71EV2A: specifies the target name

  • tas/lb32627-66: specifies the target access string

  • compound/ASAP-0016733-001/L/exact: shows the ligand (L) representation for exact compound aliases match ASAP-0016733-001


Curating experimental data (LHS)

XChemAlign transforms crystallographic data into a biological reference frame. This involves matching ligand neighbourhoods across crystalforms, assemblies, and chains onto a appropriate reference structures. This process generates various sites which make their way into Fragalysis via tags. All tag information is also included in the metadata.csv in any Fragalysis download.

xca_transformation_figure

Working with XCA-assigned sites/tags

The main XCA-assigned site is called a Canonical Site and effectively clusters ligands. When you first upload a target to Fragalysis all CanonSite tags will have an auto-generated name and colour. See CanonSite 2: 2 - c0692/D/304/1 in the screenshot below. The yellow 2 in the hit navigator corresponds to this canonical site.

lhs_canonsite_highlight

Editing tags

The colour, name, and visibility of the site can be changed in the Edit Tags dialog.

edit_tags

Note

N.B. you must be logged in and on the UAS proposal to make any changes to a Fragalysis target.

Changing site assignments

If you disagree with the clustering from XCA you can change the assigned sites via the observation dialog (click on the observation count to open):

change_xca_site

Adding “Curator” tags

For additional annotation of structures Curator Tags can be created via the Edit Tags window. Select Other -- new tag -- from the dropdown. These tags can be added to observations in the tag navigator:

add_tag

Indicating merging hypotheses

For Fast Forward Fragments it is required to create one Curator Tag for each group of fragments that you wish to explore merging. These can often just be all hits in the pockets of interest.

Indicating experimental / model quality

The experimental / model quality can be indicated using the traffic light system:

traffic_lights

Each observation will have a Main Status, it should be decided in your project who has the final say on this, typically there is one main data owner / structural biologist. All other members are recommended to only add Peer Reviews. These not only have a status, but also allow for a comment.


Uploading assay measurements or computed scores (LHS)

Fragalysis supports annotation of experimental data with text or numeric scores that are linked either to compound codes or observation short codes.

Warning

Do not upload any assay data to a public target that is confidential! Measurements against compounds that do not (yet) have structures will still be accessible to authorised API users.

Creating the assay data CSV

Create a CSV with:

  • one identifier column, containing either compound codes or observation short codes

  • as many text/numeric columns as you want

  • The data type of columns can optionally be specified by an additional row containing text, int, or float

assay_data_csv

Please note that CDD data can be exported as a CSV and often uploading with minimal manual modification.

Uploading

  • Log in and open your target of interest

  • Select Assay data upload from the menu

  • Complete the form:

assay_data_form

Modifying data type of existing data column

Use the /api/activity_data_curation/ endpoint to change data types of previously uploaded scores


Browsing virtual compound sets (RHS)

Overview of the RHS interface

The right hand side (RHS) is where follow-up designs and their virtual hits are navigated. Follow-up designs are grouped into compound sets, corresponding to each SDF that was uploaded (See Uploading compound sets to the RHS).

Inspirations

The F button on each compound can be used to bring up a modal with the experimental hits used as inspirations / references for the compound design. The same LHS visualisation buttons are available to superimpose the inspiration hit with the follow-up design. When an experimental dataset is displayed, all virtual designs referencing that ligand will have their F icon active.

rhs

Sorting and filtering

Clicking on the filter filter icon allows you to sort and filter the compounds by properties present in the uploaded set. Typically you will find scores such as energy_score representing computed binding energy, distance_score representing RMS distance to the fragment inspirations, and score_inspiration which may indicate how well the fragments references have been recapitulated:

rhs

Curating virtual compound sets (RHS)

Colours / painting

You can paint compounds with colours that can be renamed, i.e. “Yes”, “No”, “Maybe”:

lhs

These labels will be assigned to compounds in your session and can be downloaded as a CSV in the “selected compounds” tab

Warning

The state of the Fragalyis RHS does not persist when you refresh or otherwise leave the page. To export a copy of your curations remember to download a CSV:

lhs

Arrows

Use these arrows to quickly apply the current visualisations to adjacent compounds

This works best when inspirations modal is open, and the inspiration hits and current compound are shown as ligands

Exporting curations

Once you have painted compounds you can export a CSV which can be used to share your curations/review with others:

lhs

Compounds from different sets they can be viewed together in the “selected compounds” tab.


Uploading virtual compound sets (RHS)

In order to disseminate non-experimental structures/ligands with Fragalysis, they can be uploaded using the “RHS upload” option in the “Hamburger menu”, which takes you to the viewer/upload_cset endpoint:

lhs

Supported data format

To upload a compound set to the RHS of Fragalysis an SD file (SDF) must be prepared.

Header molecule

Fragalysis requires a header molecule that defines properties for the whole compound set. The molecule and coordinates of the header molecule are completely ignored, however there are required properties:

Property

Value

_Name

ver_1.2

ref_url

Reference URL for the algorithm / dataset

submitter_name

Compound set submitter’s name

submitter_email

Compound set submitter’s email

submitter_institution

Compound set submitter’s institution

generation_date

Date associated with the data (ISO 8601)

method

Algorithm / method name for this compound set

Additionally, if you want to include extra text or numerical properties for ligands in this set you will have to include that property in the header as well with a description value. For example if you want to include a energy_score property with each ligand you will need to include this as a property on the header as well, with a text description:

Property

Value

energy_score

Computed binding energy (kcal/mol)

An example header molecule is provided below.

Ligands

The required properties for each non-header molecule are different:

Property

Value

_Name

compound name

ref_pdb

Reference protein (Fragalysis observation short-code, e.g. A0310a)

ref_mols

Reference datasets that inspired this molecule/pose (Fragalysis observation short-code, e.g. A0310a)

Ligands and proteins (SDF + ZIP of PDBs)

If you have computed custom protein conformations associated with these ligands they can be provided in the upload form as a separate ZIP archive. In this case, your ref_pdb values for each ligand should be the name of the relevant PDB file.

Example Header

ver_1.2
     RDKit          3D

 14 15  0  0  0  0  0  0  0  0999 V2000
   -3.4503    1.0190   -1.1743 C   0  0  0  0  0  0  0  0  0  0  0  0
   -2.2533    1.0671   -0.5344 N   0  0  0  0  0  0  0  0  0  0  0  0
   -2.1679   -0.0620    0.1865 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.3036   -0.8455    1.1366 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.4390   -1.7388    0.2452 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.3763   -0.9521   -0.6603 N   0  0  0  0  0  0  0  0  0  0  0  0
    1.4334   -0.1564   -0.1409 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.0843   -0.7615    0.8099 N   0  0  0  0  0  0  0  0  0  0  0  0
    3.2028   -0.1250    1.4766 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.1795    0.3255    0.4069 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.7544    1.5811   -0.2821 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.9712    1.4810   -0.5890 S   0  0  0  0  0  0  0  0  0  0  0  0
   -3.3092   -0.7524   -0.0399 N   0  0  0  0  0  0  0  0  0  0  0  0
   -4.0785   -0.0801   -0.8706 O   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  2  0
  2  3  1  0
  3  4  1  0
  4  5  1  0
  5  6  1  0
  6  7  1  0
  7  8  2  0
  8  9  1  0
  9 10  1  0
 10 11  1  0
 11 12  1  0
  3 13  2  0
 13 14  1  0
 14  1  1  0
 12  7  1  0
M  END
>  <ref_url>  (1) 
https://github.com/mwinokan/BulkDock

>  <submitter_name>  (1) 
Max Winokan

>  <submitter_email>  (1) 
max.winokan@diamond.ac.uk

>  <submitter_institution>  (1) 
DLS

>  <generation_date>  (1) 
2024-12-02

>  <method>  (1) 
Knitwork_CavB_impure

>  <SLURM_JOB_ID>  (1) 
SLURM_JOB_ID

>  <SLURM_JOB_NAME>  (1) 
SLURM_JOB_NAME

>  <csv_name>  (1) 
csv_name

>  <scratch_subdir>  (1) 
scratch_subdir

>  <fragmenstein_runtime>  (1) 
fragmenstein_runtime

>  <fragmenstein_outcome>  (1) 
fragmenstein_outcome

>  <fragmenstein_mode>  (1) 
fragmenstein_mode

>  <fragmenstein_error>  (1) 
fragmenstein_error

>  <exports>  (1) 
exports

>  <HIPPO Pose ID>  (1) 
HIPPO Pose ID

>  <HIPPO Compound ID>  (1) 
HIPPO Compound ID

>  <smiles>  (1) 
smiles

>  <ref_pdb>  (1) 
protein reference

>  <ref_mols>  (1) 
fragment inspirations

>  <original ID>  (1) 
original ID

>  <compound inchikey>  (1) 
compound inchikey