My name is Arash Abadpour and this is my story:
In the late nineties, I was finishing my undergraduate studies in electrical engineering. I received word that the biomechanics laboratory had acquired a frame-grabber and was looking for someone to help them build it into a gait analysis system. In the system that we eventually built, two camcorders captured a person on the catwalk, and our software produced a csv file that contained two sets of x-y trajectories for each of the retroreflective spherical markers that we had attached to a dozen of the subject’s joints. Later on, we added pedobarography (image-based foot pressure analysis) to the system, and the team performed a study on patients with diabetes. 3d modelling of hard and soft tissue using X-ray, CT, MRI, ultrasound and other types of medical images that we received as Dicom files was next on the roster in the lab.
Image processing was applied mathematics on images, and I started a master’s program on color image processing at the mathematics science department in 2003. The program introduced me to higher mathematical concepts such as linear spaces and principal component analysis (PCA). PCA, in turn, led me to watermarking, color transfer (to colored and grayscale images), and semantic segmentation of aerial images. More importantly, I realized that to arrive at high-value image processing algorithms, one needs to be comfortable with the underlying mathematical models.
By 2005, I had moved to Canada and had started my Ph.D. on optimization. I found it fascinating that a complex system can be modelled and analyzed mathematically to produce valuable insights. For my Ph.D. thesis, I carried out this type of analysis on a capacity maximization problem in a cellular system and presented a solution strategy. During that period, I also worked as a research assistant on video-on-demand network design and maintenance using a multi-layer fuzzy clustering model – more info in the journal publication.
I joined Epson Canada in 2009, where for six and a half years, I worked on commercial and industrial applications, including visual inspection, symbology detection, 3d object detection and pose estimation, camera calibration, 3d scan/display systems (using stereo, time-of-flight, and structured light projection), head-mounted displays (for augmented and virtual reality), and robotics (assembly and bin-picking). Multiple patents came from that work. I also expanded my work on fuzzy clustering and published several journal papers during this period.
In 2015, I joined Intellijoint Surgical (IJS). This was my first startup experience and a deep dive into cost optimization within the framework of an image processing system. IJS’s infrared monocular camera is an exquisite technology that allows for high-accuracy pose estimation in the operating room. I had the opportunity to work on the tracking system and its applications, including an in-vivo infrared laser scanner – more info in the patents section.
I first participated in the development of a learning-enabled machine vision application in 2016 at Fio. An Android-based device ran our ML models to recognize rapid diagnostic tests (rdts) and read their results in the field. An rdt is a cassette, 5-10 centimeters (2-4 inches) long, on which one or multiple active membranes allow for rapid testing for diseases such as malaria, HIV, Dengue, Zika, etc. There are thousands of rdts in the market, and each requires that a certain amount of saliva, blood, or urine is added to a particular membrane. Additional chemical solutions must also be added to other membranes on the rdt. While rdts are manufactured in large numbers to facilitate massive cost-efficient deployments of disease diagnostics, the task of information-collection from such implementations mandates machines in the workflow.
When I joined Fio, the company had already field-tested its V-100 version and was working on its V-200 system. These handheld devices allowed for controlled imaging of rdts in the field. Our algorithm visually recognized the rdt and thus provided healthcare workers with info on the use of their particular rdt. Our system then assisted and monitored the healthcare worker as they processed the rdt and the patient. After the incubation time for the rdt was complete, we asked the healthcare worker to put the rdt back in the device drawer. At this time, we captured a final image that was processed by a convolutional neural network (ConvNet) to determine whether the patient was positive or negative, or whether the test had been soiled, for example due to excessive blood deposition. I led the development of this system in matlab, and we then machine-translated it to python.
As the precision and recall figures from the algorithm moved into more valuable zones, we also envisioned reducing the human-machine friction. We developed designs for an overhead system, composed of cameras and projectors – an active desk for the healthcare worker, in effect. This device tracked the healthcare worker’s hands as they moved rdts and other objects around and provided assistance through interactive objects projected on the desk – more info in the patents section.
In 2019, Betterview enabled me to experiment with a neural network that consumes aerial images and other types of data relevant to the insurance industry to produce property insights. By this time, I had already spent two years on Coursera and other MOOC resources, and I was comfortable with python and its machinery for scientific programming (numpy, scipy, seaborn, pandas, etc.) and machine/deep learning (tensorflow, scikit-learn). At the same time, hands-on experience allowed me to develop sizable gpu-saving strategies through transfer learning and weight sharing at training time and tensor sharing during inference. At a more personal level, during the same period, I worked on my cloud programming (aws/gcs) skills. I also acquired first-hand experience with the efficient acquisition, enrichment, and management of ground truth and human annotations (mturk/dls/boutique).
Recently, both the literature and the experiments that I have been involved in have produced encouraging results in terms of the precision, recall, and iou numbers that state-of-the-art networks of convolutional neural networks are capable of producing. On a more conceptual level, I am fascinated by the fact that the ground-truth modules in these systems simultaneously ingest seemingly different types of information, including images, polygons, rois, scores, and text, and participate in the process of weaving all of that data into predictions that have market value and are produced at scale and significant margins.
In late 2018, I started a self-funded exploratory/educational project aimed at developing a mobile robot that could recognize its operator and maneuver safely in its environment. Blue, as the robot was subsequently named, is a sphero rvr connected to an ad-hoc mesh network of ~40 raspberry pis and other linux machines. Blue perceives its environment through rgb images and dsm+imu (through an intel realsense stereo camera). The code for blue is in bash and python (including django). Blue uses aws s3/rds for file/data storage and aws sqs for communication.
Blue is a continuous machine learning system that manages the full circle of data acquisition, human annotation collection, classifier training, and inference through cooperation with the other nodes on the network.
— live view of the interactions in the network: kamangir.net/shamim
— more about blue: abadpour.com/projects/blue.
— last updated: 19 August 2020