Quickstart VM¶

The Mouse Light Acquisition Pipeline virtual machine is a self-contained instance of the complete pipeline system. It contains two example projects that demonstrate pipeline processing, and you can add more. A VMWare version is available for download. For other virtual machine applications, please consult the application documentation regarding importing or converting VMWare VM instances.

Overview¶

Download and install VMWare Player, VirtualBox, or a similar application
Download the virtual machine and unzip the archive (https://janelia.figshare.com/projects/MouseLight_Acquisition_Pipeline_VM/63212)
Open the virtual machine in your host application and follow the instructions in the ReadMe included in the virtual machine archive or below

Using the Virtual Machine¶

Please Note: The project titled “Example” will operate directly from this virtual machine image. However, the project titled “Brain 5x5x5 Section” requires the input data to be downloaded to your host machine.

Open and run the virtual machine in VMWare Player (free for individual or non-commercial use), Workstation, Fusion, or equivalent
Log in to the virtual machine with user:pass mluser:pipeline
View and control pipeline activity within the virtual machine by double-clicking the ViewPipeline desktop shortcut to open the web user interface
View and control pipeline activity from your host machine or from elsewhere on your network by determining the virtual machine IP address (e.g., via ifconfig) and browsing to http://<vmip>:6101. For example, it may be something like http://192.168.1.105:6101 on an internal network.
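
The last step can be sketched as a pair of small shell helpers. This is only an illustration: the function names pipeline_url and pipeline_reachable are hypothetical, the port 6101 is taken from the step above, and the reachability check assumes curl is installed on the machine you run it from.

```shell
#!/bin/sh
# Build the web interface URL for a given virtual machine address.
# Port 6101 comes from the steps above; the function name is hypothetical.
pipeline_url() {
    printf 'http://%s:6101\n' "$1"
}

# Succeed if the web interface answers at the given address.
# Assumes curl is available on the machine running the check.
pipeline_reachable() {
    curl --silent --fail --max-time 5 --output /dev/null "$(pipeline_url "$1")"
}

# Print the URL for the example address used above.
pipeline_url 192.168.1.105
```

If ifconfig is not available inside the virtual machine, `ip -4 addr show` lists the same addresses on most current Linux systems.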

Operate the Example Project¶

The example project is a small subsection of actual data in which the source images have been truncated. This allows reasonably fast processing (seconds to minutes per tile) of a small number of tiles, so that you can view and explore the different stages of a functional pipeline project. It also lets the required data fit into the virtual machine image without greatly increasing its size.

By default the Example project is activated, but its individual data processing stages are not; otherwise they might finish processing before you have a chance to explore the system. When you are ready, select the Stages panel from the left, select the Example project from the dropdown in the upper right of the panel, and start the Line Fix stage. After a few moments tiles will begin processing. Generally only one or two tiles run at a time, depending on the stage, to keep the memory and CPU requirements of the virtual machine relatively low. The downstream stages can be turned on as well so that the full project runs, or you can turn stages on and off individually if you would like to follow tiles through the system.

Operate the 5x5x5 Section Project¶

The 5x5x5 project uses a subsection of actual data where the source images have not been altered in any way. However, to run this project you must download the data files to your host machine and place them in a folder that is shared with the virtual machine.

Download the 5x5x5 project data
Place the data in a directory named pipeline somewhere on the host machine
In the virtual machine settings, add a shared folder by selecting that directory named pipeline

This will mount the data from the host machine in the virtual machine in the manner that the 5x5x5 project is preconfigured to expect. You can mount data in other ways (see below), but the example projects are preconfigured to work with this particular naming and mounting scheme.

Processing Data from your Host Machine¶

Any folder added as a shared folder for the virtual machine will appear under /external/ to the pipeline services in the virtual machine and in their respective containers. For example, the 5x5x5 project above is preconfigured to read data from /external/pipeline/018-08-01_raw-5x5x5-tileid-13024, which works by sharing a host directory named pipeline that contains the 5x5x5 data (in a directory titled 018-08-01_raw-5x5x5-tileid-13024). If you were to also share a directory named testdata in the virtual machine settings, its contents would be accessible to the pipeline services as /external/testdata.
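
The naming rule above can be written down as a small shell helper. This is a sketch for illustration only: the function name external_path is hypothetical, and it encodes nothing beyond the /external/ prefix described above.

```shell
#!/bin/sh
# Map a shared folder name to the path the pipeline services see.
# The /external prefix comes from the text above; the function name is hypothetical.
# $1: shared folder name, $2: optional path inside the shared folder
external_path() {
    if [ -n "${2:-}" ]; then
        printf '/external/%s/%s\n' "$1" "$2"
    else
        printf '/external/%s\n' "$1"
    fi
}

# The two examples from the text above.
external_path pipeline 018-08-01_raw-5x5x5-tileid-13024
external_path testdata
```

The first call prints the path the 5x5x5 project reads from, and the second prints /external/testdata.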

Frequently Asked Questions¶

Where is the output?¶

The output from the Example project is located in /data/pipeline-output/sample. There is an auto-generated directory for each stage labeled with the stage depth and name.

The output from the 5x5x5 project is in a directory named output in the same pipeline directory that was mounted with the input data.

These settings can be changed by changing the output location for each of the stages in the projects.

Why is the Classifier stage named “Classifer (test)” and why does it use the task “Axon Test”?¶

The Example project uses modified input data that only contains 81 tiles and whose input tile images are reduced in size. This makes the project small enough to include in the virtual machine and provide a functional example.

As a result, the classifier stage also uses a modified ilastik project that does not perform the actual classification used for real samples. The overall behavior of the stage and of the project is identical to using the actual classifier stage. For real sample processing, you can duplicate the sample project and simply change the task used in the duplicated classifier stage to “Axon UInt16” (the real classifier task) rather than the “Axon Test” task.

How do I add more workers?¶

Clone or copy an additional instance of the pipeline virtual machine
- Note that if you do a normal copy you may need to use the facilities in your virtual machine application (VMWare, or VirtualBox) to generate a new MAC address for the copy (the Clone command in VMWare does this automatically)
In any worker-only copy, modify the system to not launch the core pipeline services at startup
- sudo systemctl disable pipeline
Update the worker-only copy to use your original instance with the full pipeline services running
- In /data/pipeline-systems/pipeline-worker open options.sh in your editor of choice (e.g., nano is installed)
- Change PIPELINE_CORE_SERVICES_HOST and PIPELINE_API_HOST from 172.17.0.1 to the IP address or hostname of the original virtual machine running the full pipeline services (use ifconfig in the virtual machine or similar)

For a third or subsequent worker, duplicate the worker-only instance; only the first step (cloning or copying, with a new MAC address if needed) is necessary.
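
The options.sh change in the last step can also be scripted. The following is a minimal sketch, not the project's own tooling: the function name set_core_host is hypothetical, and it assumes that the lines setting PIPELINE_CORE_SERVICES_HOST and PIPELINE_API_HOST contain the literal default address 172.17.0.1. Check the file by hand afterwards.

```shell
#!/bin/sh
# Point a worker-only copy at the virtual machine running the full pipeline services.
# $1: path to options.sh, $2: IP address or hostname of the original virtual machine
# Assumption: both variables are set on lines containing the literal default
# address 172.17.0.1. The function name is hypothetical.
set_core_host() {
    # Keep a backup, then rewrite the file in place so its permissions are preserved.
    cp -p "$1" "$1.bak" &&
        sed -E "/PIPELINE_CORE_SERVICES_HOST|PIPELINE_API_HOST/ s/172\.17\.0\.1/$2/g" "$1.bak" > "$1"
}

# Example call (not run here), using the path from the step above:
# set_core_host /data/pipeline-systems/pipeline-worker/options.sh 192.168.1.105
```

Only lines that mention one of the two variables are touched, so any other use of 172.17.0.1 in the file is left alone, and the original is kept as options.sh.bak.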

Can I reset the sample project and run it again?¶

Yes. On the Stages panel, select the first stage in the project (“Line Fix” for the sample project) in the stage table. With the stage selected, choose Completed from the drop down in the upper left of the tile list below the stage table.

Choose “Resubmit All” from the upper right of the tile list. If you have made changes and some tiles have failed, also select Failed from the drop down in the upper left and choose “Resubmit All” again.

It will take a few minutes for the first stage tiles to revert to “Queued” status and for that status to propagate down through the rest of the stages.

A second option is to duplicate the project, which will create a second project and a full second set of stages. You will then need to adjust the root location for the project (and/or copy the input file to that location), and you will likely want to change the stage output locations from their default (“copy” appended to the original).

How do I view logs for the services?¶

In /data/pipeline-systems/pipeline/pipeline-deploy-services, the script ./pipeline-logs.sh will start a Docker container with access to the log storage. Once it is running, cd /var/log/pipeline to review logs from the pipeline-api, pipeline-scheduler, and pipeline-client services.