Hints and Remarks
Units
Strictly speaking, the pipeline is mostly agnostic to scale and units, but we use SI units throughout the project and strongly recommend that you do, too. The medical field often uses millimeters instead of meters, but we have decided that using meters makes it easier to keep everything consistent and to parameterize other values such as pressure, gravity, or elasticity. As a useful side effect, since organs are usually several centimeters in size, the resulting outputs often fall in the range of ~0.01 to ~0.1, which is well suited to our deep learning settings.
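If your input data is in millimeters, one option is to convert it once, right after loading, so that everything downstream is in meters. The following is only a minimal sketch: the helper name mm_to_m and the point-list layout are assumptions made for illustration and are not part of the pipeline.

```python
# Hypothetical helper (not part of the pipeline): convert coordinates
# given in millimeters to meters before passing them on.
MM_TO_M = 1e-3

def mm_to_m(points_mm):
    """Return a new list of (x, y, z) tuples scaled from millimeters to meters."""
    return [tuple(coord * MM_TO_M for coord in point) for point in points_mm]

# An organ-sized extent of a few centimeters, given in millimeters.
points_mm = [(0.0, 0.0, 0.0), (120.0, 45.0, 80.0)]
points_m = mm_to_m(points_mm)
```

Converting at the boundary keeps physical parameters such as pressure or gravity in plain SI units everywhere else.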
Random values and determinism
We recommend keeping the pipeline you build deterministic, as this makes debugging and reproducing results much easier. Since PipelineBlocks may be called in any order, Python’s random.random() function (and similar functions that draw from the shared global generator) should be avoided.
Instead, each DataSample provides its own random number generator (seeded once at startup with the DataSample’s ID), which can be accessed through core.datasample.DataSample.random. This is an instance of the random.Random class, so you can sample random values from it by calling its random(), uniform(), and similar methods.
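The effect of a per-sample generator can be sketched as follows. SampleStandIn is a hypothetical stand-in that only mimics the random attribute described above; it is not the real DataSample class, and the parameter names and ranges are made up for illustration.

```python
import random

class SampleStandIn:
    """Hypothetical stand-in for a DataSample: holds an ID and its own RNG."""
    def __init__(self, sample_id):
        self.id = sample_id
        # Seeded once with the sample's ID, as described above.
        self.random = random.Random(sample_id)

def draw_parameters(sample):
    """Draw values from the sample's own generator, never the global one."""
    return {
        "scale": sample.random.uniform(0.5, 1.5),
        "flip": sample.random.random() < 0.5,
    }

# Process the same sample IDs in two different orders.
forward = {i: draw_parameters(SampleStandIn(i)) for i in [1, 2, 3]}
random.random()  # unrelated use of the global generator has no effect
backward = {i: draw_parameters(SampleStandIn(i)) for i in [3, 2, 1]}
```

Because each generator depends only on its sample's ID, the values drawn for a given sample are the same regardless of processing order or of what other code does with the global generator.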
Print Dataset Results
To analyze a dataset after creating it, try using the src/analyze.py script, like so:
python3 src/analyze.py --data_path path/to/my/dataset
See python3 src/analyze.py --help for more options.