Comparing peak tables

jrl · 21 July 2023 12:34

Hi,

Is there an easy way in assign to pick up differences between two peak tables? Essentially, I would like to take two peak lists and get the differences between them without having to stare at them endlessly.
Obviously I can, e.g., export the assignments as strings and look for duplicates in Excel or some other tool, but that is also quite cumbersome. I imagine this could be accomplished with 5 lines of code in Python.

Thanks,
Józef

VickyH · 21 July 2023 12:39

Hi Józef,

yes, the easiest thing would be to do a little script. What sort of differences are you interested in? Peaks (i.e. assignments) that appear in one list, but not the other?

Vicky

jrl · 21 July 2023 12:46

Yes, precisely. I just have a series of spectra with different mixing types and different mixing times, and I would like to see the peaks that appear in one list but not another. This would help me to:

1. See which peaks appear at longer mixing times.
2. Track any inconsistencies in assignment. This is easy with a few spectra but becomes increasingly difficult if you have 30 of them.

Thank you,
Józef

VickyH · 21 July 2023 13:02

So this bit of very simple code would print out (in the console) a list of peaks which only occur in the first peak list:

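pl1 = get('PL:hsqc.1')
pl2 = get('PL:hsqc.2')

# print every peak in pl1 whose assignment is not found on any peak in pl2
for pk1 in pl1.peaks:
    pkStatus = 0
    for pk2 in pl2.peaks:
        if pk1.assignedNmrAtoms == pk2.assignedNmrAtoms:
            pkStatus = 1  # a matching assignment exists in pl2
            break
    if pkStatus == 0:
        print(pk1, pk1.assignedNmrAtoms)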
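
For longer peak lists, the same comparison can be sketched with a set lookup instead of a nested loop. This is only a sketch under assumptions: `peaks_only_in_first` and `make_peak` are hypothetical helper names, and the peaks are stand-in objects with string atom labels so that the snippet runs outside the program. It also assumes that `assignedNmrAtoms` is hashable (a tuple of tuples); in a real session the peaks would come from `get('PL:hsqc.1').peaks` and `get('PL:hsqc.2').peaks`.

```python
from types import SimpleNamespace


def peaks_only_in_first(peaks1, peaks2):
    """Return the peaks from peaks1 whose assignments match no peak in peaks2."""
    # Assumption: assignedNmrAtoms is hashable, so it can be stored in a set.
    seen = {pk.assignedNmrAtoms for pk in peaks2}
    return [pk for pk in peaks1 if pk.assignedNmrAtoms not in seen]


def make_peak(name, *assignments):
    """Hypothetical stand-in for a peak; NOT a real program object."""
    return SimpleNamespace(name=name, assignedNmrAtoms=tuple(assignments))


# Stand-in data: one assignment per peak, one atom label per dimension.
list1 = [
    make_peak('pk1', ('A.4.LYS.H', 'A.4.LYS.N')),
    make_peak('pk2', ('A.5.GLY.H', 'A.5.GLY.N')),
]
list2 = [
    make_peak('pk3', ('A.4.LYS.H', 'A.4.LYS.N')),
]

for pk in peaks_only_in_first(list1, list2):
    print(pk.name, pk.assignedNmrAtoms)
```

Swapping the two arguments gives the peaks that occur only in the second list.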

However, if you are looking at a whole series of spectra, then in principle the Chemical Shift Mapping module ought to be able to give you this kind of information. I shall check…

Vicky

VickyH · 21 July 2023 13:26

Hi Józef,

looks like you probably could use the Chemical Shift Mapping Module to do something helpful here. I’ve done a little video for you which shows you how this might work.

Vicky

jrl · 21 July 2023 14:04

Hi Vicky,

You are a star! Thank you. That will save me loads of time.

Best,
Józef

jrl · 21 July 2023 18:55

Hi Vicky,

One more quick question. If I wanted to modify the script you provided to display only pk1.assignedNmrAtoms where Assign F1 belongs to NmrChain ‘NC:chain1’ and Assign F2 to NmrChain ‘NC:chain2’ (or vice versa), what would be the syntax?

I have got it working by converting the objects to strings, but there must be a more elegant way to do it.
My way is:

substrings = ["@chain1", "@chain2"]
pl1 = get('PL:hsqc.1')
for pk1 in pl1.peaks:
    if all(x in str(pk1.assignedNmrAtoms) for x in substrings):
        print(pk1, pk1.assignedNmrAtoms)

Thanks,
Józef

LucaM · 21 July 2023 19:21

Hi,
one way to filter could be this, shown here for dimension 1 and chain A:

assignedNmrAtomsDim1 = peak.getByDimensions('assignedNmrAtoms', [1])
filteredNmrAtomsDim1 = [na for ll in assignedNmrAtomsDim1 for na in ll if na.nmrResidue.nmrChain.name == 'A']
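
Along the same lines, the inter-chain case (one dimension on chain1, the other on chain2, in either order) could be checked per assignment rather than by string matching. This is a minimal sketch, assuming each entry of `assignedNmrAtoms` holds one NmrAtom (or None) per dimension of a 2D peak. `links_chains`, `make_atom` and the peak names are hypothetical, and the stand-in objects only mimic the `nmrResidue.nmrChain.name` attribute path so that the snippet runs on its own.

```python
from types import SimpleNamespace


def make_atom(chain, atom):
    """Hypothetical stand-in exposing .name and .nmrResidue.nmrChain.name."""
    nmr_chain = SimpleNamespace(name=chain)
    nmr_residue = SimpleNamespace(nmrChain=nmr_chain)
    return SimpleNamespace(name=atom, nmrResidue=nmr_residue)


def links_chains(peak, chain_a, chain_b):
    """True if any assignment of a 2D peak pairs chain_a with chain_b, in either order."""
    wanted = {(chain_a, chain_b), (chain_b, chain_a)}
    for assignment in peak.assignedNmrAtoms:
        # Skip partially assigned entries and anything that is not 2D.
        if len(assignment) != 2 or any(na is None for na in assignment):
            continue
        chains = tuple(na.nmrResidue.nmrChain.name for na in assignment)
        if chains in wanted:
            return True
    return False


# Stand-in peaks; in a real session these would be pl1.peaks.
peaks = [
    SimpleNamespace(name='inter', assignedNmrAtoms=(
        (make_atom('chain1', 'H'), make_atom('chain2', 'N')),)),
    SimpleNamespace(name='intra', assignedNmrAtoms=(
        (make_atom('chain1', 'H'), make_atom('chain1', 'N')),)),
    SimpleNamespace(name='reverse', assignedNmrAtoms=(
        (make_atom('chain2', 'H'), make_atom('chain1', 'N')),)),
    SimpleNamespace(name='partial', assignedNmrAtoms=(
        (make_atom('chain1', 'H'), None),)),
]

selected = [pk.name for pk in peaks if links_chains(pk, 'chain1', 'chain2')]
print(selected)
```

Comparing the chain names directly avoids false matches that a substring test could pick up, for example a chain called chain10.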

VickyH · 24 July 2023 11:04

Hi Józef,

I’m not entirely sure what you’re trying to do, but hopefully Luca’s code does what you need.

A bit of help with getting away from breaking things down into strings all the time:
I’ll often try things on the Python console and something like

pk = get('PK:hsqc.1.7')
pk.

followed by the tab button will give you a list of all the properties associated with a peak.

pk.a

followed by tab will give you lots of assignment related ones.

Unfortunately the whole peak assignment stuff is a bit involved because of the ability to have multiple assignments per peak dimension.

Luca uses

peak.getByDimensions('assignedNmrAtoms', [1])

which is a nice way to get hold of dimension-related properties such as assignments. So you can substitute ‘assignedNmrAtoms’ with something like ‘ppmPositions’ if you like. And the [1] refers to the dimension (note that it uses the dimension number, so 1 or 2 for a 2D, rather than starting at 0 as Python indexes usually would).

There is a similar

pk.getByAxisCodes('assignedNmrAtoms', ['N'])

method which can also be useful.

The other thing to remember is that you can get hold of lots of things in a nested/extended way, so if you have an NmrAtom (e.g. from pk.assignedNmrAtoms), then you can get hold of individual bits of information with

na = get('NA:A.4.LYS.H')
na.name
na.nmrResidue.sequenceCode
na.nmrResidue.residueType
na.nmrResidue.nmrChain.name

etc.
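
As an illustration of this nested access, the sketch below collects those attributes into a small record and groups atoms by chain name. `describe`, `group_by_chain` and `make_atom` are hypothetical helpers, and the atoms are stand-ins built with `SimpleNamespace` instead of real NmrAtoms fetched with `get(...)`.

```python
from collections import defaultdict
from types import SimpleNamespace


def make_atom(chain, sequence_code, residue_type, name):
    """Hypothetical stand-in mimicking the nested NmrAtom attributes."""
    nmr_chain = SimpleNamespace(name=chain)
    nmr_residue = SimpleNamespace(
        nmrChain=nmr_chain,
        sequenceCode=sequence_code,
        residueType=residue_type,
    )
    return SimpleNamespace(name=name, nmrResidue=nmr_residue)


def describe(na):
    """Collect the nested attributes of one atom into a plain dict."""
    residue = na.nmrResidue
    return {
        'chain': residue.nmrChain.name,
        'sequenceCode': residue.sequenceCode,
        'residueType': residue.residueType,
        'atom': na.name,
    }


def group_by_chain(nmr_atoms):
    """Map each chain name to the atom names assigned on that chain."""
    groups = defaultdict(list)
    for na in nmr_atoms:
        groups[na.nmrResidue.nmrChain.name].append(na.name)
    return dict(groups)


atoms = [
    make_atom('A', '4', 'LYS', 'H'),
    make_atom('A', '4', 'LYS', 'N'),
    make_atom('B', '12', 'GLY', 'H'),
]

print(describe(atoms[0]))
print(group_by_chain(atoms))
```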

Hope this helps,

Vicky

jrl · 24 July 2023 18:21

Hi Vicky,

Thank you for all the explanations. Yes, they are helpful.
Overall, I am just trying to streamline, as much as possible, working with dozens of peak lists containing distance restraints from experiments with different types of mixing and mixing times. To keep things organised I am trying to classify them as different types (it is a bit of a complicated system with different chains and a complex spatial arrangement).

Thanks,
Józef

P.S. It is really nice to start writing some useful macros!