35t Joint Meeting 20141020
This meeting was called to share understanding of issues pertaining to the online/offline interface for the 35t detector. It was driven by a list of questions (some naive) that was distributed beforehand.
Misc Questions
In no particular order, these are the questions pertaining to the DAQ and its interface with offline:
- what is the low level DAQ file format/encoding (ROOT? binary?)
- what schema does the content of DAQ files follow?
- where is a diagram showing all the parts of the DAQ data stream with their names (millislice, microblock, tick, "trigger", etc)?
- what are the start/stop criteria that define the highest level "chunk" of data (e.g. a "trigger")?
- how does this "chunk" correspond to a trigger for each expected trigger criterion?
- what time-ordering is expected from data coming out of the DAQ, particularly between disparate sources (e.g. Wires/PDs)?
- are we going to implement a "world clock" with 1 us resolution?
- what unit of data "chunk" ("event") is required and desired for offline analysis?
- what offline analysis decisions/limitations can we, as a collaboration, be comfortable "baking in" to this choice of unit?
- how are DAQ data "chunks" (triggers) numbered? How are offline data "chunks" ("events") numbered?
- what is the mapping from the former to the latter?
- what changes to art and/or LArSoft are needed to accommodate the above answers?
Notes
BV
These notes were taken by BrettViren.
In overview, we started out following the list of questions but then branched out into a more free-form (and fruitful) discussion. The notes are a set of bullet points I scribbled down as we went and may not be fully coherent.
- Low-level file format will be a ROOT file containing ROOT TTree objects
- The schema of these objects will be a collection of binary data "blobs"
- C++ "overlay" classes are being developed which will handle the unpacking.
- They are in a library that depends on Art but is otherwise independent of online and offline
- (followup to check: is it reasonable to make this library independent from Art?)
- An ArtDaq raw "DaqEvent" is a vector of ArtDaq "fragments". A fragment is a "blob" of data with some metadata, where "blob" means experiment-defined packed data
- In 35t terminology, a fragment corresponds to a millislice
- See DocDB #9677 for list of run modes from Giles
- See DocDB #9871 for issues pertaining to 35t raw object definitions from Jeff
- Blobs follow an internal schema and carry a version number that identifies it
- We should put both this version and a timestamp in a location that is version-independent
- Q: (from Josh) Why is there a microslice? A: (from Giles) this assures all data streams are kept in lock-step
- Q: Why is there a millislice? A: see Jeff's slide #7
- A "trigger" is applied in ArtDaq. A better description would be to call this a "filter". Exactly one "trigger" or "filter" condition is in effect for an entire "run"
- Run modes 1 and 2 assume zero suppression
- milliblock stores absolute timestamps. These come from the 56-bit, 64 MHz clock of the Nova time system and are relative to a GPS-backed time.
- Q: (Josh) What indexes from a Nova time at the start of a millislice to the data after zero suppression? A: (details from Giles) Basically, enough info is provided to let one "un-zero-suppress" and regain the original data, but with 0 values for samples below threshold.
- There will/may be additional computed values provided that give some parameterization of the pulses that survive zero suppression
- millislice is a collection of microslices + control/header info.
- milliblock is a collection of millislices from the different data sources in the detector
- order is not assured but there is an index that allows ordering
- (to check: it may be desirable to reorder inside the to-be-written raw-data Art input module)
- Ends of millislices (thus milliblocks) will overlap by a few microslices. See Jeff's DocDB.
- Overlap is deterministic and may be perfectly removed either at the end of online or at the beginning of offline
- Duplication is driven by the need to keep ArtDaq a parallel, multi-node process and the desire to retain efficiency over millislice boundaries. Alternatives discussed:
- Make ArtDaq "remember" last millislice -> won't work due to parallelism
- Make ArtDaq nodes ask others for last millislice -> hair on fire
- Spend the money for a single, huge multi-core, shared memory ArtDaq machine -> Jeff estimates that the bandwidth of a naive algorithm using un-zero-suppressed data is too large to concentrate into a single box. See the backup slide in his DocDB referenced above
- Q: How are streams joined? A: an aggregator process as part of ArtDaq
- Q: How do we know the run mode (and thus the "trigger"/"filter" condition)? A: ultimately it comes from run control. It will be stored in both the RC DB and as part of select info copied to the offline DB. It will also be added to a special data stream with one object added to each output file.
- Revisited file boundaries and subruns. Need to revisit the email discussion. May use subrun boundaries if RunControl pause/resume is used or may equate subruns with file boundaries.
- Q: Do we have milliblocks now in LArSoft? A: (Tom) No, not in the exact format, but you can already make things that look roughly like them in terms of physics. (BV) In principle we should be able to put a process after LArSoft that emulates the electronics and whose output can be fed into the exact ArtDaq application used online. (Giles) That would be most helpful if we had it right now for 35t; we do have an emulator of sorts now, but it is not backed by detector simulation.
- Q: what will be the run size? Unsure, anywhere from seconds to days.
- Q: what version of Art/ROOT? Unsure exactly but will try to freeze for the duration of the 3-month run.
- Data estimate: Mark had 300MBps zero suppressed for 90 days, giving 100TB. Jeff recently redid the estimate and got 100-150TB, so Tom's 200 TB is not so high after all. This is dominated by muons. It also assumes good performance with zero suppression.
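The note about putting the blob version and a timestamp in a version-independent location can be illustrated with a small sketch. The byte layout here (a little-endian 4-byte version followed by an 8-byte timestamp) and the names `HEADER`, `pack_blob` and `read_header` are assumptions made for illustration only; they are not the 35t format or the actual overlay classes.

```python
import struct

# Hypothetical fixed header: little-endian uint32 schema version, then a
# uint64 timestamp. Everything after it is version-dependent payload.
HEADER = struct.Struct("<IQ")


def pack_blob(version, timestamp, payload):
    """Prefix a packed payload with the fixed, version-independent header."""
    return HEADER.pack(version, timestamp) + payload


def read_header(blob):
    """Return (version, timestamp, payload) without interpreting the payload."""
    if len(blob) < HEADER.size:
        raise ValueError("blob is shorter than the fixed header")
    version, timestamp = HEADER.unpack_from(blob, 0)
    return version, timestamp, blob[HEADER.size:]
```

The point of the design is that a reader can always recover the version and timestamp first, and only then choose the unpacking code that matches that version.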
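The 56-bit, 64 MHz timestamp mentioned above implies a tick of 15.625 ns, or 64 ticks per microsecond. A minimal sketch of the conversion from a raw tick count to time since the clock epoch follows. The function names are hypothetical, and the epoch itself is deliberately left out because these notes do not state it.

```python
CLOCK_HZ = 64_000_000                 # 64 MHz clock, from the notes
COUNTER_BITS = 56                     # counter width, from the notes
MAX_TICKS = (1 << COUNTER_BITS) - 1


def ticks_to_time(ticks):
    """Split a tick count into (whole seconds, leftover nanoseconds)."""
    if not 0 <= ticks <= MAX_TICKS:
        raise ValueError("tick count does not fit in 56 bits")
    seconds, remainder = divmod(ticks, CLOCK_HZ)
    # Integer arithmetic avoids float rounding; the result is floored to 1 ns.
    nanoseconds = remainder * 1_000_000_000 // CLOCK_HZ
    return seconds, nanoseconds


def rollover_years():
    """Years until a 56-bit counter at 64 MHz wraps around."""
    return (1 << COUNTER_BITS) / CLOCK_HZ / (365.25 * 86_400)
```

With these numbers the counter wraps only after roughly 35 years, so an absolute timestamp of this width is unambiguous over the life of the experiment.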
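The "un-zero-suppress" idea from Giles's answer can be sketched as follows. The input layout, a list of (start tick, samples) pairs per channel, is an assumption for illustration; the real packed format is defined by the DAQ blobs and is not described in these notes.

```python
def unsuppress(n_ticks, blocks):
    """Rebuild a full-length waveform from zero-suppressed blocks.

    blocks is a hypothetical list of (start_tick, samples) pairs holding the
    samples that survived suppression. Every other tick is filled with 0.
    """
    waveform = [0] * n_ticks
    for start, samples in blocks:
        end = start + len(samples)
        if start < 0 or end > n_ticks:
            raise ValueError("block falls outside the readout window")
        waveform[start:end] = samples
    return waveform
```

For example, `unsuppress(8, [(2, [5, 9]), (6, [4])])` gives `[0, 0, 5, 9, 0, 0, 4, 0]`. The original below-threshold values are not recovered, only their positions.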
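The notes on unordered millislices and on the deterministic overlap at millislice boundaries suggest a reorder-then-deduplicate step, for instance in the raw-data Art input module. A minimal sketch, assuming each millislice carries an ordering index and each microslice carries a globally increasing sequence number (both are hypothetical names, not the actual 35t fields):

```python
def merge_millislices(millislices):
    """Order millislices by index and drop microslices already seen.

    millislices is a list of (index, microslices) pairs, where microslices is
    a list of (sequence_number, payload) pairs. Microslices duplicated in the
    overlap between consecutive millislices are kept only once.
    """
    merged = []
    last_seq = None
    for _, microslices in sorted(millislices, key=lambda m: m[0]):
        for seq, payload in microslices:
            if last_seq is None or seq > last_seq:
                merged.append((seq, payload))
                last_seq = seq
    return merged
```

Because the overlap is deterministic, the same result could also be obtained by trimming a fixed number of microslices from each millislice; the sequence-number comparison is used here only because it does not need to know that number.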
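The arithmetic behind a data-volume estimate like the one above can be sketched as below. The notes do not record what live time or duty factor was assumed, so `live_fraction` is a hypothetical parameter, and decimal units (1 TB = 10^6 MB) are an assumption as well.

```python
SECONDS_PER_DAY = 86_400


def total_volume_tb(rate_mb_per_s, days, live_fraction=1.0):
    """Total data volume in TB for a given rate, run length and live fraction."""
    return rate_mb_per_s * live_fraction * days * SECONDS_PER_DAY / 1e6


def average_rate_mb_per_s(volume_tb, days):
    """Average rate in MB/s implied by a total volume over a run length."""
    return volume_tb * 1e6 / (days * SECONDS_PER_DAY)
```

As a cross-check, `average_rate_mb_per_s(100, 90)` is about 12.9 MB/s, while `total_volume_tb(300, 90)` is about 2330 TB. The quoted 300MBps and 100TB therefore cannot both describe continuous running in these units, so the original estimate presumably used a different unit, a peak rather than average rate, or a small live fraction.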