Import ccpn peak list to SPARKY/POKY for CHESPA

ampratt2 · 15 May 2024 17:37

Hi all,

Hoping this is the best category… I have been using ccpn to track chemical shifts of my titrations. I want to import the H/N assignments from the peak lists I’ve generated in ccpn into SPARKY/POKY, so that I can use the CHESPA/CHESCA framework. SPARKY does not seem to accept .nef files for this, nor do I want to import the BMRB file into SPARKY and track all the peaks again for over 50 spectra. Is there a script out there that converts the ccpn peak list to a .str file? Or another way that I can import my peak lists into the SPARKY platform?

Thanks in advance!
-Aerial M. Owens

varioustoxins · 16 May 2024 11:20

Hi

I would look at NEF-Pipelines; it has support for importing and exporting POKY / Sparky files.

Information is here:

varioustoxins/NEF-Pipelines: Nef tools (github.com)

If you run into problems don’t hesitate to send me an e-mail

Regards
Gary

Dr Gary S Thompson NMR Facility Manager
CCPN CoI & Working Group Member
Wellcome Trust Biomolecular NMR Facility
School of Biosciences, Division of Natural Sciences
University of Kent, Canterbury, Kent, England, CT2 7NZ

Tel: 01227 82 7117
Email: g.s.thompson@kent.ac.uk
ORCID: orcid.org/0000-0001-9399-7636

VickyH · 16 May 2024 14:12

Alternatively, here is a (relatively quick and dirty) macro which will export Sparky peak lists for all HSQC spectra in your project (i.e. spectra where you have set the Experiment Type to 15N HSQC/HMQC with ET).

In the final line of the code you can specify the file path for where all the peak lists will be saved.
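If the output folder does not exist yet, pandas will generally refuse to write the file rather than create the folder for you. Here is a minimal sketch of building the path first. `sparky_output_path` is a hypothetical helper name, not part of the macro or of CCPN, and the demo writes into a throwaway temporary folder.

```python
from pathlib import Path
import tempfile


def sparky_output_path(spectrum_name, out_dir):
    """Return <out_dir>/<spectrum_name>_sparky.txt, creating out_dir if needed.

    Hypothetical helper for illustration only.
    """
    folder = Path(out_dir).expanduser()          # resolves a leading '~' to the home folder
    folder.mkdir(parents=True, exist_ok=True)    # no error if the folder already exists
    return folder / f"{spectrum_name}_sparky.txt"


# Demo in a temporary folder; in the macro you would pass your own folder instead
demo_dir = Path(tempfile.mkdtemp()) / "sparky_lists"
out_path = sparky_output_path("hsqc_01", demo_dir)
print(out_path.name)
```

In the macro, the returned path could be passed to `df.to_csv(path_or_buf=...)` in place of the hard-coded string.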

The peak lists will be of the form:

Assignment w2 w1
N53H-N 8.142 114.311
M54H-N 8.474 120.49
K55H-N 8.347 121.413
L56H-N 8.488 124.722
G57H-N 8.288 109.328
Q58H-N 8.872 122.38
K59H-N 8.609 126.762
V60H-N 9.515 125.574
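
To check an exported file before loading it into Sparky/POKY, it can be read back with pandas. This is only a sketch: it assumes whitespace-separated columns exactly as shown above and simple labels of the form `N53H-N`. The column names `residue`, `number`, `atom_w2` and `atom_w1` are made up for illustration, and labels that do not match (for example `?-?`) are simply left empty.

```python
import io
import pandas as pd

# Two lines copied from the listing above; a real file path works the same way
sample = io.StringIO(
    "Assignment w2 w1\n"
    "N53H-N 8.142 114.311\n"
    "M54H-N 8.474 120.49\n"
)

peaks = pd.read_csv(sample, sep=r"\s+")

# Split labels such as 'N53H-N' into residue letter, residue number and atom names
label_pattern = r"^([A-Z])(\d+)([A-Za-z0-9]+)-([A-Za-z0-9]+)$"
parts = peaks["Assignment"].str.extract(label_pattern)
parts.columns = ["residue", "number", "atom_w2", "atom_w1"]

peaks = pd.concat([peaks, parts], axis=1)
print(peaks)
```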

Vicky

from collections import defaultdict
import pandas as pd

chemCompCodesDict = {
'Ala': ('A', 'ALA', 'ALANINE', 'C3H7N1O2'),
'Cys': ('C', 'CYS', 'CYSTEINE', 'C3H7N1O2S1'),
'Asp': ('D', 'ASP', 'ASPARTIC ACID', 'C4H6N1O4'),
'Glu': ('E', 'GLU', 'GLUTAMIC ACID', 'C5H8N1O4'),
'Phe': ('F', 'PHE', 'PHENYLALANINE', 'C9H11N1O2'),
'Gly': ('G', 'GLY', 'GLYCINE', 'C2H5N1O2'),
'His': ('H', 'HIS', 'L-Histidine', 'C6H10N3O2'),
'Ile': ('I', 'ILE', 'ISOLEUCINE', 'C6H13N1O2'),
'Lys': ('K', 'LYS', 'LYSINE', 'C6H15N2O2'),
'Leu': ('L', 'LEU', 'LEUCINE', 'C6H13N1O2'),
'Met': ('M', 'MET', 'METHIONINE', 'C5H11N1O2S1'),
'Asn': ('N', 'ASN', 'ASPARAGINE', 'C4H8N2O3'),
'Pro': ('P', 'PRO', 'PROLINE', 'C5H9N1O2'),
'Gln': ('Q', 'GLN', 'GLUTAMINE', 'C5H10N2O3'),
'Arg': ('R', 'ARG', 'ARGININE', 'C6H15N4O2'),
'Ser': ('S', 'SER', 'SERINE', 'C3H7N1O3'),
'Thr': ('T', 'THR', 'THREONINE', 'C4H9N1O3'),
'Val': ('V', 'VAL', 'VALINE', 'C5H11N1O2'),
'Trp': ('W', 'TRP', 'TRYPTOPHAN', 'C11H12N2O2'),
'Tyr': ('Y', 'TYR', 'TYROSINE', 'C9H11N1O3'),
}
def convertResidueCode(residueName, inputCodeType='threeLetter', outputCodeType='oneLetter'):
    """
    Convert a residue name between code types, e.g. the three-letter AA codes used in Analysis
    to the one-letter codes used in Sparky
    :param residueName: the residue name in the inputCodeType convention, e.g. 'ALA'
    :type residueName: str
    :param inputCodeType: oneLetter, threeLetter, synonym, molFormula
    :type inputCodeType: str
    :param outputCodeType: oneLetter, threeLetter, synonym, molFormula
    :type outputCodeType: str
    :return: the same residue with the new letter code/name, or None if it is not recognised
    :rtype: str or None
    """
    modes = ['oneLetter', 'threeLetter', 'synonym', 'molFormula'] # order as they come from ChemComp dictionary
    if inputCodeType not in modes or outputCodeType not in modes:
        print('Code type not recognised. It has to be one of: ', modes)
        return None
    for v in chemCompCodesDict.values():
        dd = dict(zip(modes, v))
        if residueName == dd.get(inputCodeType):
            return dd.get(outputCodeType)
    return None

# Export one Sparky-style peak list per spectrum with experiment type 'H[N]' (15N HSQC/HMQC)
for pl in project.peakLists:
    if pl.spectrum.experimentType == 'H[N]':
        sparkyDict = defaultdict(list)
        for pk in pl.peaks:
            label = ''
            resAdded = False
            for dim in pk.peakList.spectrum.dimensions:
                for na in pk.assignmentsByDimensions[dim-1]:
                    if resAdded is False:
                        if na.nmrResidue.residueType is not None:
                            resType = convertResidueCode(na.nmrResidue.residueType, inputCodeType='threeLetter', outputCodeType='oneLetter')
                            # unrecognised residue types convert to None; skip them rather than crash
                            if resType is not None:
                                label = label+resType
                        if na.nmrResidue.sequenceCode is not None:
                            label = label+na.nmrResidue.sequenceCode
                            resAdded = True
                            label = label+na.name
                    else:
                        label = label+'-'+na.name
                # dimension 1 is written as w2 and dimension 2 as w1
                if dim == 1:
                    sparkyDict['w2'].append(round(pk.position[dim - 1], 3))
                elif dim == 2:
                    sparkyDict['w1'].append(round(pk.position[dim - 1], 3))
            # only fall back to the unassigned label once all dimensions have been checked
            if label == '':
                label = '?-?'
            sparkyDict['Assignment'].append(label)

        df = pd.DataFrame(sparkyDict)
        filename = pl.spectrum.name+'_sparky.txt'
        # change the folder below to wherever the peak lists should be saved
        df.to_csv(path_or_buf='~/Documents/Temp/'+filename, sep=' ', columns=['Assignment', 'w2', 'w1'], index=False)
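
The labelling rule used in the macro (one-letter residue code, then sequence code, then the atom names joined by `-`) can be tried outside Analysis with plain values. This is a simplified sketch, not the macro itself: `sparky_label` and `THREE_TO_ONE` are hypothetical names, and anything without a sequence code or atom names is treated as unassigned.

```python
# One-letter codes for the 20 standard residues, keyed by upper-case three-letter code
THREE_TO_ONE = {
    'ALA': 'A', 'CYS': 'C', 'ASP': 'D', 'GLU': 'E', 'PHE': 'F',
    'GLY': 'G', 'HIS': 'H', 'ILE': 'I', 'LYS': 'K', 'LEU': 'L',
    'MET': 'M', 'ASN': 'N', 'PRO': 'P', 'GLN': 'Q', 'ARG': 'R',
    'SER': 'S', 'THR': 'T', 'VAL': 'V', 'TRP': 'W', 'TYR': 'Y',
}


def sparky_label(residue_type, sequence_code, atom_names):
    """Build a label such as 'N53H-N' from plain values.

    residue_type: three-letter code or None
    sequence_code: residue number as a string, or None
    atom_names: one atom name per dimension, in w2, w1 order
    """
    if sequence_code is None or not atom_names:
        return '?-?'  # simplification: treat incomplete assignments as unassigned
    # unknown or missing residue types contribute no letter
    one_letter = THREE_TO_ONE.get((residue_type or '').upper(), '')
    return f"{one_letter}{sequence_code}{'-'.join(atom_names)}"


print(sparky_label('ASN', '53', ['H', 'N']))
print(sparky_label(None, None, []))
```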

varioustoxins · 16 May 2024 14:39

Though please do have a go with NEF-Pipelines if you can. The whole idea of building up NEF-Pipelines is to produce well-documented and tested software for these jobs and to avoid hacks, which are more likely (in the long run) to have errors or gotchas ;-) Please do also let me know if you need any help reading data back from CHESCA, though it looks like you most probably don’t need to…

regards
Gary


ampratt2 · 16 May 2024 18:37

Thank you Gary! I will certainly try this and possibly reply with any holdups or hangups.

-Aerial M. Owens

ampratt2 · 16 May 2024 18:38

Thanks for this Vicky! I will save this for an alternative as well. Appreciate you sharing your script.

-Aerial M. Owens