Felix Macro Generator Command Help
Table of Contents
Introduction
Dimension
Data Filename
Matrix Name
Number of Data Points
Number of Matrix Points & Zero Filling
Time Domain Convolution (Solvent Suppression)
Linear Prediction
First Apodization (Window Function)
Second Apodization
Reverse Matrix
Fourier Transformation
Phasing
Referencing Parameters
Cut Acquisition Dimension in Half
Reference
Plot Spectrum

Introduction: (top)
The Felix macro generator is designed to aid in the creation of Felix macros for the processing of Varian 2- and 3-dimensional NMR data collected using States or States-TPPI for quadrature detection. In principle, the macro generator should work with Bruker data sets with only minor adjustments. The biggest difference will be in the names of parameters located on the form itself.

The macro generator is still under development. While I have attempted to eliminate as many bugs as possible, I cannot test all possible scenarios. If you find any bugs in the macro generator please let me know and I will attempt to fix them as soon as possible. The program at this point does some very basic error checking when executed. If any of the parameters you have selected are incompatible, an error message should appear with suggestions on how to fix the problem. However, the checks are not rigorous, and there are sure to be cases where incorrectly entered parameters are not caught by the error checking. If there are any additional error checks you would like included please let me know.

At this point the macro generator still lacks some features that I hope to incorporate when I get time. At some point I plan on adding features to process TPPI data, perform baseline correction, process sensitivity-enhanced data without having to rearrange the data beforehand with a utility such as gradsort_p2p, add additional window function choices, add more choices for solvent subtraction, and any other features that people suggest. At this point I have written a script that generates pictures for each window function you are using. This script can be run by selecting the view window function link from the NMR Tools page. Shortly, it will be incorporated as a button on the macro generator page itself. Also, I am currently developing an nmrPipe macro generator that will produce both the conversion and processing scripts needed to use the nmrPipe processing program.
Dimension: (top)
Toggle to switch between 2-dimensional and 3-dimensional data. If the spectrum is 2-dimensional, only the first two columns need to be filled in (direct dimension and f1:first indirect). For 3-dimensional data all three columns need to be filled in (direct dimension, f1:first indirect, and f2:second indirect).
Data Filename: (top)
Filename of the input data set. Data must be complex and in the correct Felix format. For cases where the data was collected with sensitivity enhancement, the data must also be reorganized into the format of a conventional data set. See me if there is any question of whether your data was collected using sensitivity enhancement. In general, any spectrum that detects amide resonances in the direct dimension will be sensitivity enhanced while most others will not be. However, this is not always true.

Filename Format - Felix will not allow capital letters or spaces in either the data filename or matrix filename. Also, for security reasons only letters, numbers and underscore are allowed in valid filenames. If you need to use other character types simply use a text editor to rename them after the macro is generated.

Felix non-sensitivity enhanced data - For non-sensitivity enhanced data use the var2flx command to convert from Varian to Felix data formats. Go to the directory where the experimental directory resides. Type var2flx (Enter). Enter the name of the experimental directory (Ex: hsqc.fid) and hit enter. Next enter the name of the Felix formatted data file, generally with a .dat extension (Ex: hsqc.dat) and hit enter. Lastly, enter 0 when asked for byte swap and hit enter. A new file will be created with the filename that you entered (in the above example hsqc.dat).

Felix sensitivity enhanced data - For sensitivity enhanced data use the v2f_grad command to perform both a rearrangement of the data set to a conventional format and a conversion to the Felix format in a single step. Usage: v2f_grad experimental_directory_without_extension ni ni2 ni3 np. (Ex: v2f_grad hsqc 128 0 0 2048) Note that no .fid extension is used in the Varian filename and that zeros are entered for ni2 and ni3. For a 3-dimensional data set ni2 would be non-zero and for a 4-dimensional data set ni2 and ni3 would both be non-zero. Also note that the final Felix formatted file will have the same filename as the one entered, with a .dat extension. During the rearrangement process the original Varian data file called fid is renamed to fid.var and a new temporary file called fid is created. The temporary file, fid, is removed at the end of the process, leaving the original Varian file under the new name fid.var. If the rearrangement process needs to be redone, the file Fidgrad must be deleted and the file fid.var must be renamed to fid before running the v2f_grad command again.

nmrPipe data - Procedures for converting nmrPipe data can be obtained from the Data Filename link from the nmrPipe macro generator page.
Matrix Name: (top)
Name of the final transformed data set. When the Felix macro is executed from within Felix the program tests to see if a filename exists with the same name as the matrix name that you entered. If it does it is deleted. Afterwards the matrix file is created and the processed data is stored inside this file. Near the top of the macro the matrix file size is displayed. Make sure you will have enough disk space to create the matrix file before executing the macro.

Matrix Name Format - Felix will not allow capital letters or spaces in either the data filename or matrix filename. Also, for security reasons only letters, numbers and underscore are allowed in valid filenames. If you need to use other character types simply use a text editor to rename them after the macro is generated.
Number of Data Points: (top)
np - Number of points in each free induction decay (fid). Each fid consists of 1/2 real and 1/2 imaginary points. For example, if np = 1024 then there are 512 real and 512 imaginary points. Window functions should be applied to the number of real points or less (<= 1/2*np and not np).

ni - Number of increments collected in the second dimension. ni is listed as a complex number. Therefore, if ni = 128 there are 128 real and 128 imaginary points in the second dimension. In this example there would be 256 fids to process in the f1 (D2) dimension.

ni2 - Number of increments collected in the third dimension. Only used when processing 3-dimensional data sets. ni2, like ni, is listed as a complex number.
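
The point counts described above can be worked out with a few lines of arithmetic. The following is only a minimal sketch in Python; the function name point_counts and its dictionary keys are made up for illustration and are not part of Felix or the macro generator.

```python
def point_counts(np_points, ni, ni2=0):
    """Split the np, ni and ni2 values into the point counts described above."""
    real = np_points // 2          # np counts real and imaginary points together
    return {
        "d1_real": real,
        "d1_imaginary": real,
        "f1_fids": 2 * ni,         # ni is complex: ni real + ni imaginary increments
        "f2_fids": 2 * ni2,        # only meaningful for 3-dimensional data
    }

counts = point_counts(1024, 128)
print(counts["d1_real"], counts["f1_fids"])   # 512 256
```

With np = 1024 and ni = 128 this reproduces the numbers quoted above: 512 real points per fid and 256 fids in the f1 (D2) dimension.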
Number of Matrix Points & Zero-Filling: (top)
Zero filling extends an fid by appending zeros to the end. This causes a slight increase in the digital resolution of the frequency domain data after fourier transformation and allows imaginary data to be reconstructed using a Hilbert transformation. In multidimensional NMR data sets the imaginary data is often discarded after phasing to save space. In order to rephase at a later time the imaginary data will need to be regenerated. It can be shown mathematically that this can only be done properly if the data was zero-filled at least once. A single zero fill doubles the number of points in an fid by appending zeros at the end. It is typical to apply a single zero-fill when transforming NMR data. Any additional zero-filling will generally not improve the resolution but may have some cosmetic appeal (smoother looking data). The zero-fill command should be applied after application of a window function. If it is used before apodization, care must be taken to ensure that the window function is applied only to the actual data points that were collected and not to any of the zeros added by zero-filling.
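
The order of operations (window first, zeros second) can be illustrated with a short Python sketch. This is not the code that Felix runs; the function name window_then_zero_fill is hypothetical, and a simple cosine window (a sinebell shifted by 90 degrees) stands in for whichever window function you choose.

```python
import math

def window_then_zero_fill(fid, final_size):
    """Apodize only the collected points, then pad with zeros to final_size."""
    n = len(fid)
    if final_size < n:
        raise ValueError("final_size must be at least the number of collected points")
    # Cosine window that falls to zero at the last collected point
    window = [math.cos(0.5 * math.pi * i / (n - 1)) for i in range(n)]
    apodized = [x * w for x, w in zip(fid, window)]
    # The appended zeros are never touched by the window function
    return apodized + [0.0] * (final_size - n)

padded = window_then_zero_fill([1.0, 1.0, 1.0, 1.0], 8)
print(len(padded))   # 8
```

Because the window reaches zero at the last collected point, the appended zeros continue the fid without a step.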

Often the tail end of an fid does not equal zero because the entire fid is shifted slightly up or down from zero. This is referred to as a DC offset. If an fid with a DC offset is fourier transformed, a spike at zero frequency will appear. Worse, if the fid is zero-filled, the appended zeros will not continue smoothly from the last point of the fid but will be offset from it. This step is interpreted as a truncation artifact during fourier transformation and causes wiggles at the base of peaks. Both of these adverse effects can be removed by simply shifting the fid up or down so that it is centered about zero. This correction is applied automatically in the acquisition dimension when using the macro generator. If this is an option that you would like to control please let me know.
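
One common way to estimate a DC offset is to average the tail of the fid, where the signal has decayed, and subtract that average from every point. The Python sketch below assumes this approach; the macro generator's own correction may be implemented differently, and the name remove_dc_offset and the tail_fraction parameter are hypothetical.

```python
def remove_dc_offset(fid, tail_fraction=0.25):
    """Subtract the mean of the last part of the fid from every point."""
    n_tail = max(1, int(len(fid) * tail_fraction))
    tail = fid[-n_tail:]
    offset = sum(tail) / n_tail      # works for real or complex points
    return [x - offset for x in fid]

corrected = remove_dc_offset([3.5, 0.5, 0.5, 0.5], tail_fraction=0.5)
print(corrected)   # [3.0, 0.0, 0.0, 0.0]
```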

D1 - The size, in real points, of the acquisition dimension (f3) after fourier transformation. The value must be a fourier number and must be at least equal to the number of real points (1/2*np). The one exception to this is if the acquisition dimension is cut in half (see below). In this case D1 can be as small as 1/2 the number of real points (1/4*np).

D1 Zero filling - Zero filling is determined by the size of D1 relative to the number of real points in the fid (1/2*np). If D1 is set larger than the number of real data points the remainder of the points are padded with zeros before fourier transformation. The amount of zero-filling that is performed is illustrated with the following examples. If np = 1024, D1 = 1024 and half = 'n' then there would be 512 real points and a single zero fill would be performed to pad the fid with 512 zeros to a final size of 1024 set by D1. Using the above example except half = 'y', two zero fills would be performed. The 512 real points would be padded with 1,536 points to bring the total of real points to 2048. After fourier transformation the 1024 points on the right half of the processed data set will be deleted leaving 1024 points, which is equal to D1.
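
The two examples above can be checked with a small Python sketch. The function name d1_zero_fills is hypothetical, and the sketch assumes that both the number of real points and D1 are powers of two.

```python
def d1_zero_fills(np_points, d1, half=False):
    """Number of zero fills (doublings) implied by D1 for the direct dimension."""
    real = np_points // 2
    # When the dimension is cut in half, twice D1 points are transformed
    # and the right half is discarded afterwards.
    transformed = 2 * d1 if half else d1
    if transformed < real:
        raise ValueError("D1 is too small for the number of real points")
    return (transformed // real).bit_length() - 1

print(d1_zero_fills(1024, 1024))             # 1
print(d1_zero_fills(1024, 1024, half=True))  # 2
```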

D2 - The size, in real points, of the second dimension (f1) after fourier transformation. The value must be a fourier number and must be at least equal to the total number of points in the D2 dimension (real + imaginary). The total number of points is equal to 2*ni. Therefore, if ni = 128, there are 128 real points + 128 imaginary points for a total of 256 points and D2 must be at least 256.

D2 Zero filling - Zero filling in D2 (f1) is determined by the value of D2 relative to ni. If D2 = 2*ni, its minimum value, then a single zero fill will be performed assuming no linear prediction (see below). If D2 = 4*ni then two zero fills will be performed. Note that it is impossible to not perform at least a single zero fill in D2 (f1) unless you are performing linear prediction (see below).

D3 - The size, in real points, of the third dimension (f2) after fourier transformation. This value only needs to be set for 3-dimensional experiments. The value must be a fourier number and must be at least equal to the total number of points in the D3 dimension (real + imaginary). The total number of points is equal to 2*ni2. Therefore, if ni2 = 32, there are 32 real points + 32 imaginary points for a total of 64 points and D3 must be at least 64.

D3 Zero filling - Zero filling in D3 (f2) is determined by the value of D3 relative to ni2. If D3 = 2*ni2, its minimum value, then a single zero fill will be performed assuming no linear prediction (see below). If D3 = 4*ni2 then two zero fills will be performed. Note that it is impossible to not perform at least a single zero fill in D3 (f2) unless you are performing linear prediction (see below).
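
The same bookkeeping applies to both indirect dimensions. The Python sketch below follows the rules stated above (a size of 2*ni gives one zero fill, 4*ni gives two) and assumes no linear prediction. It also assumes that a "fourier number" means a power of two; the function name indirect_zero_fills is made up for illustration.

```python
def indirect_zero_fills(increments, size):
    """Zero fills implied by D2 (or D3) for ni (or ni2) complex increments."""
    if size & (size - 1) != 0:
        raise ValueError("size is assumed to be a power of two")
    if size < 2 * increments:
        raise ValueError("size must be at least 2 * number of increments")
    return (size // increments).bit_length() - 1

print(indirect_zero_fills(128, 256))  # 1
print(indirect_zero_fills(128, 512))  # 2
```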

Matrix Size - The final matrix size in bytes can be determined for a 2-dimensional experiment by multiplying D1*D2*4, and for a 3-dimensional experiment by multiplying D1*D2*D3*4. While it is a good idea to perform zero filling to increase the digital resolution of 3-dimensional experiments, there are limits set by the final data set size that you will want to work with. It is typical to try to keep the final data sets to around 64 Megabytes in size to make analysis easier (too large a file will slow down screen drawing considerably). Some experiments, such as the 13C-edited noesy-hsqc and the HCCH-TOCSY, are typically processed to have a 128 Megabyte final size.
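
As a quick check before running a macro, the matrix size can be estimated with the formula above. The Python sketch below assumes the factor of 4 is the number of bytes stored per point; the function name and the example sizes are hypothetical.

```python
def matrix_size_megabytes(d1, d2, d3=1):
    """Estimated matrix size in Megabytes for the given dimension sizes."""
    size_bytes = d1 * d2 * d3 * 4      # D1*D2*4 or D1*D2*D3*4
    return size_bytes / (1024 * 1024)

print(matrix_size_megabytes(512, 128, 256))   # 64.0
```

In this hypothetical 3-dimensional example, doubling any one of the three sizes would push the matrix to 128 Megabytes.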
Time Domain Convolution: (top)
The time domain convolution is a very effective method to remove huge solvent signals, such as residual water, from your spectra. The method performs a convolution of the fid, using a window size that you set with either a sinebell or gaussian function, to locate the signal with the lowest frequency in your spectrum. Therefore, in order for this to work the solvent signal to be removed must be on-resonance (at the center of the spectrum), because the frequency at the center of the spectrum is zero. Because the fid has a finite size and the window width is greater than a single point, data points before and after the fid must be estimated. These can be determined either by averaging the slopes at the tails of the fid or by linear predicting the values. Note that it is possible to remove solvent or buffer signals that are not at the center of the spectrum as well. To do this you will need to perform a circular shift of the data set to place the signal you want removed at the center, remove the signal, and then circular shift the data back to its original location. See me if this is something you are interested in.
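
The general idea can be sketched in Python: smooth the fid with a sinebell-shaped window to isolate the slowly varying (near zero frequency) component, then subtract that component from the fid. This is only an illustration of the principle and not the Felix implementation. The function name suppress_solvent and the half_width parameter are hypothetical, and the edges are handled by simply repeating the first and last points rather than by the average-tails or linear prediction extrapolation described above.

```python
import math

def suppress_solvent(fid, half_width=20):
    """Subtract a sinebell-smoothed copy of the fid from the fid itself."""
    k = 2 * half_width + 1
    weights = [math.sin(math.pi * (j + 1) / (k + 1)) for j in range(k)]
    total = sum(weights)
    weights = [w / total for w in weights]          # normalize to unit sum
    n = len(fid)
    result = []
    for i in range(n):
        low = 0.0
        for j, w in enumerate(weights):
            # Simplification: repeat the edge points outside the fid
            idx = min(max(i + j - half_width, 0), n - 1)
            low += w * fid[idx]
        result.append(fid[i] - low)                 # remove the slow component
    return result

# A constant (zero frequency) signal is removed almost completely
flat = suppress_solvent([2.0] * 50, half_width=5)
print(max(abs(x) for x in flat) < 1e-9)   # True
```

A narrower window follows faster variations in the fid, so more signal around zero frequency is subtracted, which matches the behaviour described for small window sizes.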

Function - I have found very little difference between using a sinebell or gaussian function. I would try both and see which one works best for your situation.

Window size - The window size is an empirical value that is dependent on the linewidth of the signal you are removing and the number of points in the fid. The smaller the value the greater the amount of signal that is subtracted and the faster the calculation time. For experiments where no signals overlap the solvent signal you generally want to use small values for size, such as 20. For experiments where you have closely spaced resonances to the solvent signal it is generally best to try larger values (~60) and to test several different window sizes to get the best results of subtracting solvent and leaving your signals alone.

Extrapolation - The average tails method is faster than the linear prediction method, but generally does not do as good a job. Just as for the window size, it is best to try both methods and choose the one that works best if you have peaks close to the solvent signal. If you are simply getting rid of the water signal from an HSQC or other experiment where there is no overlap of your signals with the solvent, then it probably does not matter much which extrapolation method you use.
Linear Prediction: (top)
Linear prediction extrapolates additional data points to time-domain data (fid). Linear predicting can be an effective way to increase the number of data points, and hence resolution, for data sets that are truncated. Often in 3-dimensional data sets you set ni and ni2 small to save acquisition time even though the signal has not decayed away to zero. Using linear prediction to extend the fid in these cases can improve the resolution considerably.

Linear prediction should only be used when the signal you are trying to predict has not decayed completely away to zero. If the signal has already decayed to zero then linear predicting further data points will generally add noise and not improve resolution. It is therefore best to always process data without any linear prediction and then compare the spectra to one with linear prediction. It is also best to try different linear prediction parameters and compare them to see which works the best. Remember processing parameters can have huge effects on the quality of the data.

For experiments that have dimensions that were collected with constant time evolution it is generally best to use mirror-image linear prediction. See the readme file of the pulse sequence or ask me if you are unsure whether any of the dimensions were collected with constant time evolution. In general the mirror-image linear prediction algorithm will give superior results and is much faster to perform.

Linear prediction works best when the signal is strong, truncated, and there are as few peaks as possible to predict. Because of this last feature it is best in 3-dimensional data sets where both the f1 (D2) and f2 (D3) are to be linear predicted to fourier transform the acquisition dimension (D1) first, then transform the f1 (D2) dimension without linear prediction and then process the f2 (D3) dimension with linear prediction. Afterwards the f1 (D2) dimension can be inverse fourier transformed, linear predicted, and then re-transformed. All of this is built into the macro generator and takes no extra work on your part.

Linear Prediction Type - The choices are none, forward, backward, forward-backward and mirror-image. In general backward should not be used to linear predict forward data points. For dimensions without constant time use either forward or forward-backward linear prediction. The forward-backward gives better results but takes twice as much time. For cases where the dimension was collected with constant time evolution it is best to use mirror-image linear prediction. However, forward or forward-backward linear prediction may be used on constant time data, but never use mirror-image linear prediction on non-constant time data.

Last - Is the number of the last data point to predict. This value should generally be around 1.5*ni or 2*ni. The larger the value of last the better the resolution will be but at the expense of extra noise. Like most processing parameters it is best to try different values to see which one works the best. For cases where the signal is weak or the truncation effect is minimal it is best to use a smaller value for last and for cases where there is plenty of signal and the truncation effect is large use a larger value for last. Note that for protein work we generally do not make last greater than 2*ni, but for small molecule or peptide work you may be able to make last quite a bit larger with beneficial results.

Coefficients - The number of coefficients determines the effectiveness of the linear prediction algorithm. The greater the number of coefficients the better the results (50% of the number of points is the maximum value), however, as the number of coefficients increases the processing time increases quite dramatically. I have found it best to use around 0.25*ni to 0.33*ni for D2 and 0.25*ni2 to 0.33*ni2 for D3 for best results.
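
The rules of thumb for last and coefficients can be turned into starting values with a short Python sketch. The function name suggest_lp is hypothetical, and the ranges it returns are only the starting points quoted above, not values enforced by the macro generator.

```python
def suggest_lp(increments):
    """Suggested ranges for 'last' and 'coefficients' given ni (or ni2)."""
    return {
        "last": (int(1.5 * increments), 2 * increments),
        "coefficients": (int(0.25 * increments), int(0.33 * increments)),
    }

print(suggest_lp(128))
```

For ni = 128 this suggests trying last between 192 and 256 and between 32 and 42 coefficients, then comparing the resulting spectra.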
First Apodization (Window Functions): (top)
Rarely does fourier transformation of the fid give rise to good quality spectra. There are often problems with the final result such as truncation artifacts, low signal to noise or limited resolution. Apodization is the process in which the spectrum is convolved with another function to achieve a more satisfactory lineshape. This is done by multiplying the fid by a time domain filter function. Two common functions are the sinebell and gaussian functions. The idea is to multiply the fid by a function so that it always decays away to zero at the end. This removes the truncation artifacts that give rise to wiggles along the baseline near peaks. For fids that are severely truncated these artifacts can lead to noise that stretches across the entire spectrum. The strength of the signal determines the type of function that you will want to apply. Typically one tries to increase the resolution as far as possible while keeping noise to a minimum. For some spectra that are very noisy the only thing that can be done is to decrease the noise at the expense of resolution. The initial few points of an fid are responsible for most of the signal to noise that you get. The stronger the initial part of the fid, the weaker the noise will appear. The longer the fid "rings out", the higher the resolution will be. Therefore emphasizing the initial part of the fid will lead to good signal to noise but poor resolution, while enhancing the late parts of the fid will give better resolution but add noise. It is up to you to decide which function will give the most desirable effects. Often it is good to have two processed spectra, one with good signal to noise and one with good resolution. That way you can have the best of both worlds and will not have to compromise too much.

The macro generator has a button that allows each of the window functions to be viewed. It is a good idea that you always view the function that you are using to make sure you know what you are doing to your data. This is especially true of gaussian functions where minor adjustments of the parameters can lead to huge changes in the shape of the function. No single setting when processing the NMR data will have a greater effect on the quality of your spectra than window functions. It is therefore important that you try several different functions to find the one that gives the best possible results.

Function - The functions to choose from include none, gaussian, sinebell, and sinebell squared. I suggest trying each of them initially to find which ones give the best results. Sinebell squared functions are quite common, but gaussian functions are also used quite a bit, especially for the acquisition dimensions of NOESY spectra. Other functions can be added later if you like. The none option should generally only be used when you are applying EM with the second apodization.

Shift - Shift is used to determine the shift of the sinebell and sinebell squared functions. It is not used for the gaussian functions and can be ignored. The shift is entered in degrees. A shift of 90 gives a pure cosine function and a shift of 0 gives a pure sinebell function. Small values give increased resolution at the expense of extra noise and large values give good signal to noise at the expense of resolution.
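
The effect of the shift can be seen by computing the window values directly. The Python sketch below assumes the usual definition of a shifted sinebell, in which the sine runs from the shift angle to 180 degrees across the window; the exact formula used by Felix is not given here and may differ in detail. The function name sinebell is hypothetical.

```python
import math

def sinebell(size, shift_deg, squared=False):
    """Shifted sinebell window over 'size' points (size must be at least 2)."""
    phi = math.radians(shift_deg)
    window = [math.sin(phi + (math.pi - phi) * i / (size - 1))
              for i in range(size)]
    if squared:
        window = [w * w for w in window]
    return window

cosine = sinebell(5, 90)   # starts at 1 and falls to 0 (pure cosine)
sine = sinebell(5, 0)      # starts at 0, peaks in the middle, ends at 0
print(round(cosine[0], 3), round(sine[0], 3), round(sine[2], 3))   # 1.0 0.0 1.0
```

A shift of 90 weights the start of the fid most heavily (better signal to noise), while a shift of 0 suppresses the start and emphasizes the middle of the fid (better resolution, more noise).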

lb - lb is the amount of linebroadening. Typically a negative linebroadening value is used. The value for lb is very important and dramatically determines the shape of the gaussian function. lb is only used with a gaussian window function and can be ignored for sinebell or sinebell squared functions.

gc - gc is the gaussian coefficient. Typically a value of 0.2 is used with a negative linebroadening (lb) to give the resonances a gaussian lineshape. Normally NMR resonances have lorentzian lineshapes, which are very broad at the bottom. Gaussian lineshapes are much more attractive for NMR data as they have narrow tails near the bottom, reducing spectral overlap. Both lb and gc are dependent on the sweep width and the number of points in the fid. Because of this it is important to view the gaussian function before applying it to make sure that it is doing what you think it is. gc is only used with a gaussian window function and can be ignored for sinebell or sinebell squared functions.

Size - Size represents the number of points that the window function will be applied to in the direct dimension (D1). Typically you apply the window function to all of the real points (1/2*np). However, in cases where np was set too high the size variable allows you to select only part of the fid for transformation. Let's say that np = 2048, giving 1024 real and 1024 imaginary points. When viewing the 1024 real points of the fid you realize that the signal has decayed away by point 256. If you process the data using all 1024 real points you get a large amount of noise. If you chop the fid off after real point 512 and transform it you will get a reduction in the noise level and not diminish resolution significantly, as long as the signal truly has decayed away before the point at which you chopped the data. There is no size value for either of the indirect dimensions because only in bizarre cases would you not want to use all of the points in the transformation. For the indirect dimensions size is set automatically to ni (ni2), or in the case of linear prediction it is set equal to last.

Viewing Window Functions from the Macro Generator - Soon there will be a view button located from within the macro generator form page that will display the window function for each dimension based on the selected parameters. To do this now go to the View Window Function page located on the NMRTools page, fill in the proper information, and select View at the bottom of the page. Sorry for the hassle of filling out the form a second time to do this, but hopefully the problem will be resolved soon.

Viewing Window Functions in Felix from the Command Line - You can view a window function inside of Felix using simple command line parameters. After starting Felix on the command line enter the following commands:
def datype 1 (Sets the data type to complex)
def datsiz 1024 (Note: Choose a value that equals 0.5*np)
def swidth 8000 (Enter the appropriate sweep width)
set 1 (Makes all real values equal to 1)
dr (draws the result)
Enter window function (Ex: ss 1024 90) or (Ex: gm -2 0.2)
dr (draws the result)
Repeat from the set 1 command to try additional window functions.

Viewing Window Functions in Felix from the Menu Bar - You can view window functions in Felix in real-time using simple dialog boxes. After starting Felix go to Process --> Window Function, Select the window function type, and then select Real-Time as the Method. You can adjust all of the various parameters with simple mouse clicks.
It is best to define the data size, data type and sweep width before viewing window functions. This can be done by reading in one of the fids you are processing or by using the command line as described above in Viewing Window Functions in Felix from the Command Line.
Second Apodization: (top)
This allows the fid to be multiplied by a second window function. Currently the only choice is exponential multiplication (em). EM multiplies the data in the work space by an exponential window. This apodization function is used to reduce noise at the expense of spectral resolution. EM may be used alone (by setting the 1st apodization to none) or in conjunction with other window functions. EM is dependent on the sweep width and number of points in the fid. Because of this, always view the window function before transformation to be sure you know what you are applying. For instance, applying em 5 to an fid with 1024 points will be significantly different than applying em 5 to an fid with 256 points. Typically it is not beneficial to apply em unless the spectrum is very noisy. In these cases it can be used quite effectively to help locate weak peaks hidden under the noise. However, this is done at the expense of resolution.

EM - No exponential multiplication is applied when em is set to 0. The larger the value of em the faster the exponential decay that is applied giving reduced noise but poorer resolution.
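
The dependence on sweep width and number of points can be illustrated with a Python sketch. It assumes the common textbook form of exponential multiplication, exp(-pi*lb*t) with t equal to the point index divided by the sweep width; the exact definition used by the Felix em command is not stated here and may differ. The function name em_window and the example values are hypothetical.

```python
import math

def em_window(size, lb_hz, sw_hz):
    """Exponential window assuming exp(-pi * lb * t) with t = index / sw."""
    return [math.exp(-math.pi * lb_hz * i / sw_hz) for i in range(size)]

long_fid = em_window(1024, 5.0, 8000.0)
short_fid = em_window(256, 5.0, 8000.0)
# Compare how far the window has decayed by the last point of each fid
print(round(long_fid[-1], 2), round(short_fid[-1], 2))
```

Under this assumption the same em value attenuates the end of a 1024-point fid far more than the end of a 256-point fid, which is why it is worth viewing the window before transforming.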
Reverse Matrix: (top)
Often the indirect dimensions of 2-dimensional and 3-dimensional NMR experiments are reversed. This can be fixed by changing the phase of the receiver during detection, but it is easier to reverse the fid during processing. One way to do this is to use the reverse matrix feature found in Felix after the data has been transformed. This works fine, but does leave a slight point error in the spectra that may cause different spectra to not overlap optimally. A better way to do this is by taking the complex conjugate of the fid before fourier transformation. The complex conjugate negates the imaginary part of the data in the fid, causing a reflection of the spectrum about zero frequency. If this box is checked, that particular dimension will be reversed during processing.
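
The reflection about zero frequency can be demonstrated with a small Python sketch using a plain discrete fourier transform. The helper names dft and peak_bin are made up for this illustration, and the eight-point test signal is artificial.

```python
import cmath

def dft(x):
    """Plain discrete fourier transform (slow, for illustration only)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def peak_bin(fid):
    """Index of the largest point in the transformed fid."""
    spectrum = dft(fid)
    return max(range(len(spectrum)), key=lambda k: abs(spectrum[k]))

n = 8
fid = [cmath.exp(2j * cmath.pi * t / n) for t in range(n)]   # signal at bin +1
reversed_fid = [x.conjugate() for x in fid]                  # negate imaginary part
print(peak_bin(fid), peak_bin(reversed_fid))   # 1 7
```

Bin 7 of an eight-point transform corresponds to a frequency of -1, so the conjugated fid gives the mirror image of the original peak position.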
Fourier Transformation: (top)
When this box is checked a complex fast fourier transform will be applied to the time domain data (fids). By not fourier transforming a given dimension it is possible to load 1D vectors through the untransformed dimension allowing you to view the fids in either of the indirect dimensions. This can be useful for deciding if the signal in either the f1 (D2) or f2 (D3) dimension is truncated and can therefore be linear predicted. It is also useful for troubleshooting processing problems and for deciding if any hardware or temperature problems occurred during acquisition.
Phasing: (top)
Applies a phase correction to the frequency domain data. If both the zero order and first order phase correction values are zero, no phase correction will be performed. It is typical to read in the first fid, phase it, and then use those phase values for the direct dimension. The f1 (D2) and f2 (D3) dimensions are generally rephased after fourier transformation if needed. However, in most cases very little or no phasing is needed in the indirect dimensions.

To determine phase parameters from the initial fid in Felix:
Using the Felix menus go to File --> Open, Select *.dat from the File Types Dialog, and select the appropriate filename.dat file listed in the right window.
Fourier transform the fid. It may help for viewing purposes to apply solvent suppression, a window function or zero-filling, however none of those should affect the phase parameters.
Phase the spectrum, then note the phase0 and phase1 parameters and enter them into the appropriate locations in the macro generator form.
Referencing Parameters: (top)
Referencing information can be found from a print out of the acquisition parameters from the NMR spectrometer or by running the perl script procpar.prl on bambam.

Sweep Width - sw, sw1 and sw2 are the sweep widths for each of the three dimensions. They are needed if referencing is turned on or if either a gaussian or em window function is used.

Spectrometer Frequency - sfrq1, sfrq2, and sfrq3 are the frequencies used in each of the three dimensions. The frequency used should be the frequency at zero ppm. This can be determined by running the macro setcar on the NMR host computer or from the procpar.prl script on bambam. These values are only needed if referencing is turned on.

Reference - ref1, ref2, and ref3 are the reference values in ppm at the center of each of the three dimensions. This macro assumes that the reference value is set to the center of the spectrum. These values only need to be input if referencing is turned on. Let me know if you would like to be able to select a reference point manually and I can edit the macro generator.

Nucleus - The nucleus that is detected in each of the three dimensions. These values are for display purposes only and do not affect the referencing in any way. They are not used if referencing is turned off.
Cut Dimension in Half? (top)
For 15N edited spectra it is typical that only the amide resonances appear in the direct dimension (D1). Since all of the amide resonances are downfield of water, which is typically at the center of the spectrum, there is no need to keep the right half of the spectrum. For these cases it is best to cut the direct dimension in half. This reduces disk space by 50%, decreases processing time 4 fold for 3-dimensional spectra, and allows for faster screen drawing during analysis. In order to keep referencing information correct when cutting the dimension in half, the value of sw is divided in half and the reference point of water is moved to the right hand edge of the spectrum. This is all handled by the macro generator with no additional input from the user. At this point the cut dimension in half switch will only save the left half of the spectrum. If there is a need I can edit the generator to allow selection of the right half of the spectrum or allow the user to define a particular region to save. Let me know if either choice would be beneficial.
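
The referencing adjustment can be sketched in Python. The sketch assumes sw is in Hz, the spectrometer frequency is in MHz, and the reference value is the ppm at the center of the full spectrum; the function names and the example values (sw = 8000, sfrq = 500, water at 4.7 ppm) are hypothetical and do not come from the macro generator.

```python
def ppm_edges(sw_hz, sfrq_mhz, ref_ppm):
    """Left and right edges (ppm) of a spectrum centered at ref_ppm."""
    half_width_ppm = sw_hz / sfrq_mhz / 2.0
    return ref_ppm + half_width_ppm, ref_ppm - half_width_ppm

def cut_left_half(sw_hz, sfrq_mhz, ref_ppm):
    """Referencing for the left half after cutting the dimension in half."""
    left, _ = ppm_edges(sw_hz, sfrq_mhz, ref_ppm)
    return {
        "sw": sw_hz / 2.0,        # sweep width is divided in half
        "left_ppm": left,         # the left edge does not move
        "right_ppm": ref_ppm,     # the old center is now the right hand edge
    }

print(cut_left_half(8000.0, 500.0, 4.7))
```

In this example the full spectrum spans 16 ppm, and the retained left half runs from about 12.7 ppm down to the water reference at 4.7 ppm.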
Reference Matrix? (top)
If this box is selected then the information input in the referencing boxes will be used to reference the spectra. Referencing the spectra or modification to the referencing parameters can easily be done post-processing if needed.
Plot Spectrum? (top)
If this box is selected the spectrum will be plotted as an intensity plot after the processing is completed. This involves opening the processed matrix, determining the baseline noise level to pick an appropriate level to make the plot, and then plotting the spectrum as an intensity plot. Both the positive and negative signals are plotted. For 3-dimensional spectra the first D1-D2 plane will be drawn and a dialog box will appear to allow selection of other D1-D2 planes. I prefer contour plots to intensity plots, but I chose the intensity plot because it draws much faster. Once the user is satisfied with the plot parameters and has cropped to an appropriate size they can switch to the contour plot mode.