Title: Online Data Processing
Master URL: http://www.iram.es/IRAMES/documents/projectDataProcOnline/
Revision: UNDER CONSTRUCTION, (needs review of section implementation and use cases), 1.1.4
Revision Date: 2002-06-14
[last rev: 2001-11-15]
Authors: W. Brunswig
Contributors: H. Ungerechts, A. Sievers
Audience: technical staff, astronomers
Publisher: IRAM, Granada, Spain
Keywords: data processing, data format, raw data, calibration, quick look
Description - about this document: This project will implement online data processing under Unix. It will be one of the main
features of the new control system, but we plan to install part of it in the
current control system already. This version of the document concentrates on the installation in the current
control system.
Pending
- check sections on implementation, installation, and user guide
- check abbaToFits: the names of the pipe directories have changed!
- add description on what shall be done for each observing mode
History of document
- v 1.0 2001-11-15 wb: draft of 1st version
- v 1.1 2002-06-03 wb: document release 1
Project Plan
- 2001-11 version 1.0: Project documentation, basis of installation
in the current control system, fix requirements of release 1, first ideas about design
- 2001-12: design architecture: DONE
- 2001-12: design of the script to calibrate OTF Maps: DONE
- 2002-05-29: release 1.0 installed and running: DONE
- 2002-07: fix requirements for release 2.
- 2002-10: install release 2.
Introduction
In the current MRT control system, antenna control is done on a
slow Vax/VMS system (iramea). For some backends, the data is recorded on iramea and
automatically processed at the end of subscans and scans (pointing fits, focus,
skydips, calibrations, ...) by process: RED.
However, due to the limited processing power
of the antenna control computer it is desirable to do all the automatic
data processing on a fast Unix workstation. In the future, we foresee that all
backends will send their data directly to the MRT file server.
The new control system of the 30m RT foresees the implementation of a data processing
subsystem. Any development of the current control system shall also
take the future system architecture into consideration and can serve as
a prototype for the design of the new control system.
Requirements
dataOnlineProcessing
[done,2002-05-29]:
(modified)
The data recorded by the 30mRT control system shall be processed under Unix
directly after the end of
subscans and scans. The processing done depends on the observing
procedure of the current scan and the receivers and backends used.
In release 1 the following observing modes will be calibrated:
OTFMap, PositionSwitch, FrequencySwitch, WobbleSwitch, RasterMaps.
Calibration scans will be analysed and the results (RxTemps,...) will
be sent to the logger.
dataTransferIramEA
[done,2001-11-07]:
All header and raw-data files produced by the antenna control software
shall be transferred automatically to the MRT file server (curr. mrt-ux1)
at the end of each subscan.
dataTransferOthers
[done,2001-11-07]:
Some backends send data directly to the MRT file server:
4MHz, ABBA. This data has to be merged with the data files from the antenna
control system.
calibrateSpectra
[done,2002-05-29]:
(modified)
Data related to the heterodyne receivers shall be calibrated online under Unix.
The data processing software shall write the calibrated data (CLASS format) in
a file in a directory of the project.
quickLook
[done,2002-05-29]:
(modified)
For most observing modes it is required that results (normally in form of plots)
are displayed at the end of subscans. (This is also called "quick look".)
In release 1, we have implemented a simple quick look software.
All calibrated spectra are displayed one after the other, but the display software
does not (yet) allow scrolling through the plots. An improved version of the plot
software is foreseen for release 2.
At the end of a scan, the data processing is different from the "quick look"
processing.
processCancelledScans
[fixed,2002-05-29]:
(modified)
Data from cancelled scans shall also be processed as far as reasonable,
e.g. a pointing cancelled after subscan 2 could already provide the
azimuth pointing error.
In release 1, we also do some processing of cancelled scans, but we still have to
check exactly what is done.
feedbackObs
[fixed,2001-11-07]:
The data processing software has to send results (pointing offsets) back to the
OBS program on iramea.
This feature will not be available in the 1st version.
logging
[done,2002-05-29]:
(modified)
The data processing software shall write results into general log
files (or into data bases in the future): calibration.log, pointing.log,
(?? what else ?? please let me know).
The observer shall have a window where this information is displayed.
In the NCS, this logging information will be sent to the general
monitoring process. It shall be checked if this concept can also be used in the
current system. We also have to know which software is accessing the current log
files.
plots
[fixed,2001-11-07]:
Results from the online processing can be displayed by plots, e.g. pointing,
focus, skydip, spectra (on-off). These plots shall be displayed to
the observer (also remote). The plots shall be written into files to be
displayed later. Information about the generated plots shall be kept also
to allow for fast access (browsing) of the plots of a project. The plots
shall be kept in a project directory.
processPointing
[open,2001-11-08]:
For each observing procedure we should write down the requirements.
This requirement is just an incomplete example.
The processing of pointing scans shall:
- at the end of each subscan display the recorded data, including a gauss fit
- at the end of a scan determine the pointing corrections for azimuth and elevation
- if the backends have been calibrated before, the data shall be plotted as temperatures
- process all receivers
RM also suggested to check if the data of several pointing scans could be added together.
This feature could be implemented in a later version.
processFocus
[open,2001-11-08]:
.. to be done ..
processSkidip
[open,2001-11-08]:
.. to be done ..
calibratePSwitch
[fixed,2002-04-30]:
(modified)
Calibrate symmetric and asymmetric position switch scans using the last
calibration scan. See details in
use-case calibratePSwitch
calibrateFSwitch
[fixed,2002-04-30]:
(modified)
Calibrate frequency switch scans using the last
calibration scan. See details in
use-case calibrateFSwitch
calibrateWSwitch
[fixed,2002-04-30]:
(modified)
Calibrate wobble switch scans using the last
calibration scan. See details in
use-case calibrateWSwitch
calibrateRasterMaps
[open,2002-04-30]:
(modified)
Calibrate raster map scans using the last
calibration scan. Details are still to be worked out.
calibrateOTFMaps
[fixed,2002-04-30]:
(modified)
Calibrate OTFMap scans using the last
calibration scan. In release 1 we will request that OTF maps start with a
reference subscan! See details in
use-case calibrateOTFMap
from e-mail HU, 2001-11-15:
Before a spectral-line on-the-fly (OTF) scan a valid CAL COLD
must have been taken.
This CAL COLD scan will be used to calibrate the OTF scan.
During a standard OTF scan:
1 the first subscan is taken on an off-source reference position (REF).
2 one or several OTF subscans are taken.
3 normally another subscan is taken at a reference position.
Optionally: continue with step 2.
Normally the last subscan is a REF;
but in general the sequence can also end with an OTF subscan.
The processing of spectral-line on-the-fly observations shall:
after the first subscan (REF): do nothing
after an OTF subscan that is not the last subscan: do nothing
after the n-th REF subscan (n > 1):
process all OTF subscans taken between the (n-1)-th and n-th
REF subscan, using both the (n-1)-th and the n-th REF subscan.
at the end of the OTF scan, if the last subscan was not a REF:
process all OTF subscans taken since the last REF subscan,
using (only) the last REF subscan.
after any group of OTF subscans has been processed:
generate plots (TBD)
Notes:
------
Only "standard" OTF scans in the sense explained above shall be
processed correctly by ODP.
ODP requires that REF and OTF data are taken in the same scan.
Desiderata for future iterations:
---------------------------------
Support frequency-switched OTF.
Could consider using CAL COLD scans taken after the OTF scan and
processing of non-standard OTF scans.
In release 1, we ALSO produce intermediate results of OTF maps:
After each subscan on the source, we calibrate just this subscan and its
result is shown in the "quick look". The spectra are not included in the
standard class file however. See details in design.
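The subscan bookkeeping described in the e-mail above can be sketched as follows. This is only an illustration under stated assumptions, not the ODP implementation: the function name plan_otf_groups, the "REF"/"OTF" labels, and the tuple layout are hypothetical. A group that is closed by the end of the scan instead of a REF subscan gets 0 as its second reference.

```python
from typing import List, Tuple


def plan_otf_groups(subscans: List[Tuple[int, str]],
                    scan_finished: bool) -> List[Tuple[int, List[int], int]]:
    """Return (ref_before, otf_subscans, ref_after) groups that are ready.

    subscans: (subscan number, kind) in observing order; kind is "REF" or "OTF".
    ref_after is 0 for a trailing group that ends with the scan, not with a REF.
    """
    groups = []
    last_ref = None   # number of the most recent REF subscan
    pending = []      # OTF subscans taken since that REF
    for number, kind in subscans:
        if kind == "REF":
            # n-th REF (n > 1): process the OTF subscans since the previous REF
            if last_ref is not None and pending:
                groups.append((last_ref, pending, number))
            last_ref = number
            pending = []
        elif kind == "OTF":
            if last_ref is None:
                raise ValueError("a standard OTF scan starts with a REF subscan")
            pending.append(number)
        else:
            raise ValueError("unknown subscan kind: %r" % (kind,))
    # End of scan without a closing REF: use (only) the last REF subscan
    if scan_finished and pending:
        groups.append((last_ref, pending, 0))
    return groups
```

For the sequence REF, OTF, OTF, REF, OTF the sketch yields one group after subscan 4 and, once the scan has finished, a second group that uses only the REF of subscan 4.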
abbaToFits
[done,2001-11-15]:
Data of bolometer observations with the Abba backend shall be transformed automatically to the FITS
format.
processWhatElse
[open,2001-11-08]:
.. to be done ..
Specifications
obsProcedure
[done,2001-11-07]:
The remote data processing software has to know the observing procedure
of a scan. This information has to be recorded in such a way that data
processing can also be done later (offline).
endOfScan
[done,2002-05-29]:
(modified)
The data processing software has to know if a scan has finished or not in
order to do "quick look" processing or endOfScan processing. E.g. for pointing,
"quick look" just means plotting the recorded data, whereas at the end of
a pointing scan the pointing corrections are calculated.
workingDirectory
[fixed,2001-11-09]:
A scheme shall be set up to define a working directory for each post processing
task (e.g., otfcal, abbaToFits).
scriptCoding
[fixed,2001-11-09]:
Coding rules for the post processing scripts shall be defined. I assume
that these scripts will be written in SIC.
taskControl
[fixed,2001-11-09]:
The operator shall be able to restart the postprocessing tasks.
taskCrashes
[fixed,2001-11-09]:
If a postprocessing task (e.g. OTFCAL) crashes, the operator and observer
shall be informed and the task shall be restarted automatically.
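A minimal sketch of the restart behaviour required here, assuming the task is an ordinary child process whose non-zero exit code signals a crash. The function name supervise, the restart limit, and the notify callback (standing in for whatever informs operator and observer) are hypothetical.

```python
import subprocess
import sys
import time


def supervise(command, max_restarts=3, delay_s=0.0,
              notify=lambda msg: print(msg, file=sys.stderr)):
    """Run command and restart it after each abnormal exit.

    Returns the number of restarts that were needed before a clean exit.
    Raises RuntimeError when the task still fails after max_restarts restarts.
    """
    restarts = 0
    while True:
        exit_code = subprocess.call(command)
        if exit_code == 0:
            return restarts
        # Inform operator and observer about the crash (hypothetical channel)
        notify("task %s exited with code %d" % (command[0], exit_code))
        if restarts >= max_restarts:
            raise RuntimeError("giving up after %d restarts" % restarts)
        restarts += 1
        time.sleep(delay_s)
```

The restart limit is a design choice of the sketch: it avoids an endless crash loop when the task cannot start at all.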
Design
architectureOverview
[done,2002-05-29]:
(modified)
We have the following tasks:
- data-producer:
  the current telescope control software produces so-called
  raw data and header files: header files with the specification of the
  observation and one data file for each backend (see also
  subsections dataTransfer...). All files are finally stored
  on mrt-lx1.
- post-processing tasks (data-consumer):
  After the raw header and data files are produced, other tasks
  will process these files. Possible tasks are:
  - calibration of heterodyne data (odpCal)
  - FITS converters (odpAbbaToFits)
  - data analysis of pointing, focus, skydip (odpPointing, ...)
  - data plots (odpPlot)
The "communication" between producer and consumer is based on the
pipeline concept. Consumer tasks inform the producer that
they want to be informed about new data. It is up to the consumer
to decide what to do with the data. Details are given below in subsection
"dataProcessingPipeline".
obsProcedure
[done,2001-11-07]:
The OBS program shall write the observing procedure of a scan into the scan header
using a new OBSINP command OBSP. The procedure can be up to 8 characters long.
It has to be checked if more information about the observing procedure
has to be provided by OBS.
dataTransferIramEA
[done,2001-11-07]:
In the current system, for most backends complete raw data files are
generated on iramea.
For these backends, all raw data generated is transferred automatically
to mrt-ux1. See also
project rawDataToUnix documentation.
dataTransferOthers
[done,2001-11-07]:
- 4MHz backend:
  - The backend data is transferred directly from the backend
    processor vbe4m to the file server mrt-ux1.
  - The 4MHz backend processor also sends the UTC of all data dumps to the backend
    process beorga on iramea.
  - The antenna control software reserves space in the raw data file and
    adds DAPs for the specific UTC time given.
  - When this raw data file is transferred to the MRT file server at the end of
    a subscan, the actual data is written into the raw data file.
- ABBA:
  - Raw data files are only written for ONE channel in wobbler and
    skydip mode.
  - In the fast-scanning mode, data from the continuum backend is recorded
    every 250ms in order to have DAPs.
  - At the end of a subscan:
    - the raw data files are transferred
    - data for all channels is retrieved from ABBA
    - FITS files are generated from raw data and ABBA files
endOfScan
[done,2002-05-29]:
(modified)
At the end of a scan, the telescope control software (current control system)
increments (mod 10000, no scan 0000 ?) the scan number. A utility checks
the scan number and generates a message (id=tcs:scanDone). The message indicates
the finished scan and the last subscan number. The message is
sent to a server program process: tcsLogServer.
The server program submits a "job" to the pipeline processes and then forwards
the message to the message logger.
The next section is not used for design, it is just kept for reference:
The antenna control software does not record the total number of subscans
in the header. The total number is defined by using 5 parameters (SRPs)
plus an option for each of these parameters. We are trying to calculate the
total number of subscans from these parameters. An alternative could be to
record the total number of subscans by as (as part of the OBSP parameter).
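The scan number arithmetic mentioned above ("mod 10000, no scan 0000 ?") can be illustrated with a short sketch. Whether scan 0000 is really skipped is an open question in this document, so the sketch makes it a parameter; the function name next_scan_number is hypothetical.

```python
def next_scan_number(current: int, skip_zero: bool = True) -> int:
    """Return the scan number that follows current, wrapping at 10000.

    skip_zero=True models the (unconfirmed) rule that scan 0000 is never used.
    """
    following = (current + 1) % 10000
    if following == 0 and skip_zero:
        following = 1
    return following
```

A utility that watches the scan number would treat any change of this value as the end of the previous scan.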
processCancelledScans
[open,2001-11-08]:
(According to JB Schraml) the preheader of the raw data files has an indication
if the subscan was cancelled. The header files are normally written before the
end of a subscan (when the first backend is ready to be written) and therefore
cannot contain this type of information.
As a general rule for the NCS it should be foreseen that all subsystems can
write information before, during, and after a subscan (or whatever unit is chosen).
processingPipeline
[done,2002-05-29]:
(modified)
- The data-producer (see also subsection on "architectureOverview") will put raw data
of a new subscan into a directory "/mrt-lx1/mrt/data/"project"/r"date".
- After this, it will put a link to the header file of the new subscan in all
subdirectories of /mrt-lx1/mrt/data/Pipe/raw/pODP.
- Each data processing task has its own subdirectory (e.g. /mrt-lx1/mrt/data/Pipe/raw/cal) and monitors this
directory for new links.
- If a task detects a new link, it does what it has to do and then removes the link.
Please note:
- data processing tasks can also "forward" links to other tasks, e.g.
after a calibration the link can be forwarded to a plot task.
- the raw data producer can code (e.g. in the file name) what class
of observation we have: heterodyne, hera, abba, ... . This can
help the consumer to decide if the link has to be processed or can be
ignored.
- The current pipeline concept only foresees consumers that monitor
their directory themselves. In the future, we also foresee consumers that can
be woken up or that are started each time a job is available.
A tool shall display the state of the pipeline: which jobs are pending for how many seconds.
(This shall be a requirement of the general pipeline software.)
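A minimal sketch of such a state display, assuming that jobs are symbolic links in one subdirectory per task and that the age of a job can be read from the modification time of the link itself. The function name pending_jobs and the parameter pipe_root are hypothetical.

```python
import os
import time


def pending_jobs(pipe_root, now=None):
    """Return [(task, link name, age in seconds)] for all links still waiting.

    pipe_root is the directory that holds one subdirectory per task.
    """
    now = time.time() if now is None else now
    jobs = []
    for task in sorted(os.listdir(pipe_root)):
        task_dir = os.path.join(pipe_root, task)
        if not os.path.isdir(task_dir):
            continue
        for name in sorted(os.listdir(task_dir)):
            path = os.path.join(task_dir, name)
            if os.path.islink(path):
                # lstat looks at the link itself, not at the header file
                age = now - os.lstat(path).st_mtime
                jobs.append((task, name, age))
    return jobs
```

Printing the returned tuples once per second would already give the operator a simple view of which jobs are pending and for how long.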
The 1st version of the pipeline will have this structure:
raw
  pODP
  abbaToFits
This means: raw is the generator of tasks in directories calibration, abbaToFits. Files and
links in these subdirectories are processed by processes odpCal, odpAbbaToFits.
The odp.. processes can produce new tasks in their subdirectories (or other directories).
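The link-based hand-over between producer and consumers can be sketched as follows. This is an illustration only, assuming symbolic links named after the header file; the function names publish and consume_once, the handler callback, and the forward_to option are hypothetical and do not describe the installed odp.. processes.

```python
import os


def publish(header_path, pipe_root):
    """Producer side: link a new subscan header into every task directory."""
    created = []
    for task in sorted(os.listdir(pipe_root)):
        task_dir = os.path.join(pipe_root, task)
        if os.path.isdir(task_dir):
            link = os.path.join(task_dir, os.path.basename(header_path))
            os.symlink(os.path.abspath(header_path), link)
            created.append(link)
    return created


def consume_once(task_dir, handler, forward_to=None):
    """Consumer side: handle every link found in task_dir, then remove it.

    If forward_to is given, a link to the same header is created there
    before the own link is removed (e.g. calibration -> plot task).
    """
    done = []
    for name in sorted(os.listdir(task_dir)):
        link = os.path.join(task_dir, name)
        if not os.path.islink(link):
            continue
        target = os.readlink(link)
        handler(target)
        if forward_to is not None:
            os.symlink(target, os.path.join(forward_to, name))
        os.remove(link)
        done.append(name)
    return done
```

A consumer in the current concept would call consume_once in a loop with a short sleep, since each task monitors its own directory.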
calibration
[open,2002-05-29]:
(modified)
All data of heterodyne receivers will be processed by the calibration task.
The calibration task will prepare a script:
- define SIC variables:
  - the observing procedure (odpProcedure)
  - the project (odpProject)
  - scan and subscan number (odpScan, odpSubscan)
  - end of scan, 0: no, 1: yes (odpEndOfScan)
  - sky (0) or reference (>0) (odpSkyReference)
  - calibration scan (1) or no (0) (odpCalibration)
  - scan number of last calibration or -1 (odpCalibrationLast)
  - backend number (1,2,3,4,or 7) (odpBackend)
  - the directory where the raw data is (odpDirectory)
  - the filename of the backend file (odpFileName)
- call a script with the name of the observing procedure
- wait for the end of the script WITH a TIMEOUT
In release 1, the scripts produced are not as described above. For release 2,
we shall either implement the design as described here or document here the actual
design.
Please check if more parameters are needed.
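The two steps of this design, writing the variable assignments and waiting for the script with a timeout, can be sketched as follows. The sketch assumes that the variables are assigned with "let NAME VALUE" lines (an assumption about the SIC syntax to be used, which would also need the variables to be defined first); the function names and the example values are hypothetical.

```python
import subprocess
import sys


def render_sic_assignments(params):
    """Render one 'let NAME VALUE' line per parameter (assumed SIC syntax)."""
    lines = []
    for name, value in params.items():
        if isinstance(value, str):
            lines.append('let %s "%s"' % (name, value))
        else:
            lines.append("let %s %d" % (name, int(value)))
    return "\n".join(lines)


def run_with_timeout(command, timeout_s):
    """Run command; return its exit code, or None if it exceeded timeout_s.

    subprocess.run kills the child process when the timeout expires.
    """
    try:
        return subprocess.run(command, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        return None


if __name__ == "__main__":
    # Example values only; real values come from the raw data header
    print(render_sic_assignments({"odpProcedure": "POINT", "odpScan": 123,
                                  "odpSubscan": 2, "odpEndOfScan": 0}))
    print(run_with_timeout([sys.executable, "-c", "pass"], timeout_s=30))
```

Returning None on a timeout lets the calling task log the problem and continue with the next link instead of blocking the pipeline.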
feedbackObs
[open,2001-11-07]:
We do not plan to install this feature in the first version.
OBS will have a global section and a second process (obsServer) shall map to this
memory area and run a server software that allows to set/get values of this memory
area. OBS uses values from this array to forward the results of the data processing
to the antenna control program (internally via OBSINP commands).
monitoring
[done,2002-05-29]:
(modified)
The data processing software writes a log record that is monitored on the
observer and operator screen (as part of the general monitoring of the MRT).
It displays the scan/subscan number, the procedure and possible results.
In release 1, the log messages are displayed directly. Release 2 will
produce a more readable output.
processCalibration
[done,2002-05-29]:
(modified)
AS will design a script for calibration.
Release 1 does not support TESTCAL observations. This will be done
in release 2.
processOTF
[done,2002-05-29]:
(modified)
from e-mail HU, 2001-11-15:
after the n-th REF subscan (n>1):
---------------------------------
otfcal is called with:
1 a pre-script that assigns the appropriate values to the
following required SIC GLOBAL variables (all integers)
odp#backendCode ! backend 2 3 4 (5) or 7
odp#calScan ! CAL scan number
odp#flyScan ! OTF scan number
odp#refSubscan1 ! REF subscan before OTF subscans
odp#flySubscanList[1] ! OTF subscan
... ! more OTF subscans (optional)
odp#refSubscan2 ! REF subscan after OTF
2 a script that processes the OTF data as explained in the
requirements/specifications and according to the values
of the variables listed above.
3 a "post"-script. unused.
at the end of the OTF scan, if the last subscan was not a REF:
--------------------------------------------------------------
otfcal is called as above, except that:
1 odp#refSubscan2 = 0 ! must be set to 0 (zero)
generate plots (TBD)
--------------------
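As an illustration of the pre-script step, the following sketch collects the values for the odp# variables listed above for one group of OTF subscans. It assumes the limit of 99 OTF subscans stated in the implementation section; the function name otfcal_variables and the dictionary representation are hypothetical, and writing the values into an actual SIC pre-script is not shown.

```python
MAX_OTF_SUBSCANS = 99  # limit of odp#flySubscanList given in this document


def otfcal_variables(backend_code, cal_scan, fly_scan,
                     ref_before, otf_subscans, ref_after=0):
    """Return the odp# variable values for one otfcal call.

    ref_after must stay 0 when the scan ended without a closing REF subscan.
    """
    if not 1 <= len(otf_subscans) <= MAX_OTF_SUBSCANS:
        raise ValueError("need between 1 and %d OTF subscans" % MAX_OTF_SUBSCANS)
    variables = {
        "odp#backendCode": backend_code,
        "odp#calScan": cal_scan,
        "odp#flyScan": fly_scan,
        "odp#refSubscan1": ref_before,
        "odp#refSubscan2": ref_after,
    }
    # SIC arrays are indexed from 1
    for index, subscan in enumerate(otf_subscans, start=1):
        variables["odp#flySubscanList[%d]" % index] = subscan
    return variables
```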
abbaToFits
[done,2001-11-15]:
A script abbaToFits will monitor directory /mrt-lx1/mrt/data/Pipe/raw/abbaToFits for
new links to raw data headers. If the observation used the bolometer with Abba it:
- will transfer the subscan data from the Abba backend
- and execute the abbaToFits program to generate a Fits file in the directory
/mrt-lx1/mrt/data/"project"/fits.
A future version of the script will also allow to do the transformation offline.
The names of the parameters will be modified in release 2 to use the same names for all
calibration scripts if possible (e.g., use onScan instead of flyScan).
Implementation
obsProcedure
[done,2001-11-07]:
The OBS software code has been modified
and now writes the observing procedure into the header using OBSINP command OBSP.
dataTransferOthers
[done,2001-11-07]:
- 4 MHz backend: the design is implemented. Currently, the 4MHz processor
writes data to files on the file-server via NFS. This data is merged with
the pseudo raw data files after the transfer to the file server as part
of the data transfer software. (source: /mrt-ux1/usr/local/bin/b4merge.icn)
- Abba: the process abbaToFits.py transfers ABBA data files via FTP and then
executes a merger program abbaToFits. (sources in /mrt-lx1/mrt/data/src): TO BE CHECKED
- Hera: TO BE CHECKED
processCalibration
[open,2001-11-07]:
The process odpCalibration (source /mrt-lx1/mrt/data/src/CalProc/pipeCal.icn) does the following:
- it checks if new links are in /mrt-lx1/mrt/data/Pipe/raw/pODP
- a script with the name of the observing procedure is executed
- the link is removed
The OTFCAL scripts are under development (by AS, HU).
We plan to replace the existing program by a python script.
dataProcessingAbbaToFits
[open,2001-11-07]:
To be done. See also Project Abba Control.
processOTF
[open,2001-11-15]:
from e-mail HU, 2001-11-15:
"odp#prescriptOTF.otf"
----------------------
Template script showing how to assign values to the required
variables. In this template the values are taken from the SIC
parameters &1 to &9; in a real case they should be set directly.
Up to 99 OTF subscans can be specified:
odp#flySubscanList[1] = i
...
odp#flySubscanList[99] = j
"odp#processOTF.otf"
--------------------
Process the OTF subscans as explained above:
For each OTF subscan in odp#flySubscanList (for each element of
odp#flySubscanList that is > 0) the scripts odp#prescript1OTF.otf and
odp#process1OTF.otf are called.
Scripts, Used Scripts, and Helper Scripts:
-----------------------------------------
odp#defineOTF.otf
ensures the correct definition of the SIC GLOBAL variables
odp#examineOTF.otf
makes a neat list of the SIC GLOBAL variables and their values
odp#plot1OTF.class
(pre-prototype, unused)
makes a plot
odp#prescript1OTF.otf
sets values for processing of 1 OTF subscan
odp#prescriptOTF.otf
sets values for processing of OTF subscans from one CRO cycle
odp#process1OTF.otf
processes 1 OTF subscan
the maximum number of OTF "dumps" is set to 1999 in this script;
maybe this should be determined by another variable ...
NOTE: THIS IS THE (ONLY) SCRIPT THAT DOES "REAL WORK".
SHOULD BE CHECKED AND TESTED ON MORE DATA.
odp#processOTF.otf
organizes processing of OTF subscans from one CRO cycle
Implementation Notes:
---------------------
These scripts assume that a SIC logical "ODP:" is defined pointing to
the directory containing the scripts.
All names of these scripts, as well as the GLOBAL SIC variables they
use, start with the string "odp#".
Some overhead could be avoided by processing the CAL in
odp#processOTF.otf. In the current version the CAL is processed in
odp#process1OTF.otf for each OTF subscan (and each backend). For
development and test purposes this has the advantage that all the
"real work" is encapsulated in one script.
Note:
-----
The prototypes of these scripts are on mrt-ux1 in:
/users/astro/ungerech/ncs30m/onlineDataProcessing
abbaToFits
[done,2001-11-15]:
TBD
Installation
obsProcedure
[done,2001-11-07]:
A new OBS version has been installed as the default version on iramea
that records the observing procedure in the raw data header.
processCalibration
[open,2001-11-07]:
The process pipeCal has to be started during boot of the file server
meet2001-11-12
[done,2001-11-13]:
Meeting by AS, HU, WB: discussed concepts as written in "predraft" (2001-11-09).
HU suggested to:
- separate calibration and further processing (data plot, pointing, ...)
- start and stop OTFCAL for each subscan
- check if calibration of data could also be done with calibrations done after
an observation
AS mentions that data of focus, skydip, and calibration are not recorded in CLASS files.
We agreed on which steps to do next (see also section "plan").
User Guide (2002-06-06)
Operation
Start of odpCalibration process (2002-05-29)
The odpCalibration process is started during boot of
host: mrt-lx1 by script
script: /etc/rc.d/tcs.
Restart of odpCalibration process (2002-05-29)
This note on how to restart odpCalibration will be replaced soon by a script
the operator can execute.
In case of a failure of the odpCalibration process, the operator can restart the process on
host: mrt-lx1 under account root:
- Check if the process is still running:
ps uxa | grep odpCal
Result (similar to this, numbers can be different)
mrt 3529 0.2 0.4 3384 2108 pts/1 S 09:39 0:06
/usr/bin/python ./odpCal.py --pipeDir /mrt-lx1/mrt/data/Pipe/raw/pODP/ --name odp
- If there is still such a process, stop it:
kill -9 3529 # replace 3529 with the current process number
- restart odpCal with command:
/etc/rc.d/tcs.d/odpCal
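Since this note is to be replaced by a script, the first step (finding the process number) could be sketched like this. The sketch only parses text in the "ps uxa" layout shown above, with the command in the 11th column; the function name find_pids is hypothetical, and stopping and restarting the process are left to the commands given in this section.

```python
def find_pids(ps_output, pattern="odpCal"):
    """Return the PIDs of all processes whose command line contains pattern.

    ps_output is the text printed by "ps uxa": USER PID %CPU %MEM VSZ RSS
    TTY STAT START TIME COMMAND. The grep process itself is ignored.
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or incomplete line
        command = fields[10]
        if pattern in command and not command.startswith("grep"):
            pids.append(int(fields[1]))
    return pids
```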
Observer
File with calibrated data (2002-06-06)
Calibrated spectra are written into
file: /mrt-lx1/mrt/data/(project-directory)/spectraOdp.30m .
During creation of projects we set up a link: data.30m
to this file in the default directory of the project. We also set up a
link: data to the
directory where project data (rawdata, fits files, ...) are stored. Access to the
data directory and the file data.30m is read-only!
Check if these links are defined for your project. If not, log in to your project account
and execute:
ln -s /mrt-lx1/mrt/data/(project-directory)/spectraOdp.30m data.30m
ln -s /mrt-lx1/mrt/data/(project-directory)/ data
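The check described above can also be sketched in Python. The function name ensure_link is hypothetical; it creates the symbolic link only when it is missing and refuses to touch an ordinary file or directory of the same name.

```python
import os


def ensure_link(target, link_name):
    """Create the symbolic link link_name -> target unless it already exists.

    Returns True if the link was created, False if a link was already there.
    """
    if os.path.islink(link_name):
        return False
    if os.path.exists(link_name):
        raise FileExistsError("%s exists and is not a link" % link_name)
    os.symlink(target, link_name)
    return True
```

It would be called once for data.30m and once for data, with the same targets as in the two ln commands.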
Files with quick look data (2002-05-29)
Intermediate spectra used for "quick look" are written into
file: /mrt-lx1/data/mrt/(project-directory)/spectraPlot.30m .
Monitor online data processing (2002-05-29)
On host: mrt-lx1 enter:
/mrt/tcs/tools/tcsMessageMonXterm odpCal
Plotting calibrated data (2002-05-29)
The plot utility will display quick look data and the final calibrated data.
On host: mrt-lx1 you can start the plot program by entering:
/mrt/tcs/tools/odpPlot
In order to kill the plot process, enter "jobs" to find out the number of the plot job
and enter "kill -9 %n" with n being the job number.
List of Requirements/Specifications and Descendants
List of Hypertext references and swItems
- file: name="/mrt-lx1/mrt/data/"project"/r"date"
- file: name=/mrt-lx1/mrt/data/Pipe/raw/pODP
- file: name=/mrt-lx1/mrt/data/Pipe/raw/abbaToFits
- file: name=/mrt-lx1/mrt/data/"project"/fits
- host: name=mrt-lx1
- host: name=iramea
- host: name=mrt-ux1
- host: name=vbe4m
- process: name=beorga
- ref: href=ucCalibratePSwitch.html text=use-case calibratePSwitch
- ref: href=ucCalibrateFSwitch.html text=use-case calibrateFSwitch
- ref: href=ucCalibrateWSwitch.html text=use-case calibrateWSwitch
- ref: href=ucCalibrateOTFMap.html text=use-case calibrateOTFMap
- ref: href=http://www.iram.es/IRAMES/documents/projectPipeline/ text=pipeline concept
- ref: href=http://www.iram.es/IRAMES/documents/prRawDataToUnix/ text=project rawDataToUnix documentation
- ref: href=http://www.iram.es/IRAMES/documents/projectAbbaControl text=Project Abba Control
- swItem: type=process name=RED
- swItem: type=process name=tcsLogServer host=mrt-lx1
- swItem: type=host name=mrt-lx1
- swItem: type=script name=/etc/rc.d/tcs
- swItem: type=file name=/mrt-lx1/mrt/data/(project-directory)/spectraOdp.30m
- swItem: type=link name=data.30m
- swItem: type=link name=data
- swItem: type=file name=/mrt-lx1/data/mrt/(project-directory)/spectraPlot.30m