Context Action Framework¶
This package provides a collection of common class definitions that feature extractors (e.g., in the vision pipeline) and action predictors should use.
Introduction¶
The system is split into the following components:
Context
Action Blocks:
Cut Block
Lever Block
Move Block
Push Block
Turn Over Block
Vice Block
Vision Block
Context¶
The context is defined as the state of the system including the work-cell module that is being operated in, the positions of objects in the system, and the state of the robots and the modules.
The modules, robots, end effectors, cameras, object labels, device faces, and actions listed below are all specified as enums in types.py.
The modules are:
vision
panda1
panda2
vice
cutter
The robots are:
panda1
panda2
The end effectors are:
soft hand
soft gripper
screwdriver
The cameras are:
basler
realsense
The object labels are:
hca
hca_empty
smoke_detector
smoke_detector_insides
smoke_detector_insides_empty
battery
pcb
internals
pcb_covered
plastic_clip
wires
screw
battery_covered
gap
The faces of the HCAs and smoke detectors are:
front
back
side 1
side 2
The actions are:
none
start
end
cut
lever
move
push
turn over
vision
vice
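As a minimal sketch of what such enums might look like, the snippet below defines three of them with Python's standard `enum` module. The member names follow the lists above; the class names `Robot`, `Camera`, and `Action` and the integer values are assumptions for illustration and are not copied from types.py.

```python
from enum import Enum, auto


class Robot(Enum):
    # Members mirror the robot list above; the values are arbitrary.
    panda1 = auto()
    panda2 = auto()


class Camera(Enum):
    basler = auto()
    realsense = auto()


class Action(Enum):
    none = auto()
    start = auto()
    end = auto()
    cut = auto()
    lever = auto()
    move = auto()
    push = auto()
    turn_over = auto()
    vision = auto()
    vice = auto()


# Look up a member by its name, e.g. when parsing a configuration string.
camera = Camera["realsense"]
print(camera.name)
```

Using enums rather than bare strings means a misspelled name fails immediately with a `KeyError` instead of propagating through the pipeline.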
Device Names¶
HCA names:
kalo2
minol
kalo
techem
ecotron
heimer
caloric
exim
ista
qundis
enco
kundo
qundis2
Smoke detector names:
senys
fumonic
siemens
hekatron
kalo
fireangel
siemens2
zettler
honeywell
esser
The context action framework provides a function lookup_label_precise_name to get the device name given the device number. See types.py.
ROS Detection Message Format¶
The context action framework provides the Detection class, defined in types.py. Two helper functions convert between representations: detections_to_ros converts Python Detection objects to ROS Detection.msg messages, and detections_to_py converts ROS messages back to Python objects.
Each detection has the following attributes:
id (int): index in detections list
tracking_id (int): unique ID per label that is stable across frames.
label (Label): hca/smoke_detector/battery…
label_face (LabelFace/None): front/back/side1/side2
label_precise (str/None): 01/01.1/02…
label_precise_name (str/None): kalo/minol/fumonic/siemens/…
score (float): segmentation confidence
tf_px (Transform): transform in pixels
box_px (array 4x2): bounding box in pixels
obb_px (array 4x2): oriented bounding box in pixels
center_px (array 2): center coordinates in pixels
polygon_px (Polygon nx2): polygon segmentation in pixels
tf (Transform): transform in meters
box (array 4x3): bounding box in meters
obb (array 4x3): oriented bounding box in meters
center (array 3): center coordinates in meters
polygon (Polygon nx3): polygon segmentation in meters
obb_3d (array 8x3): oriented bounding box with depth in meters
parent_frame (str): ROS parent frame name
table_name (str/None): table name of detection location
tf_name (str): ROS transform name corresponding to published detection TF
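To make the attribute list more concrete, here is a hedged sketch that models a subset of these attributes as a dataclass and picks the best-scoring detection for a label. `DetectionSketch` and `best_detection` are hypothetical names, not the framework's Detection class or API, and the coordinate values and the `vision_table` frame name are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DetectionSketch:
    """Hypothetical stand-in holding a subset of the attributes listed above."""
    id: int
    tracking_id: int
    label: str                          # the real class uses a Label enum
    score: float
    center_px: Tuple[float, float]
    center: Tuple[float, float, float]
    parent_frame: str
    label_face: Optional[str] = None
    table_name: Optional[str] = None


def best_detection(detections, label, min_score=0.5):
    """Return the highest-scoring detection with the given label, or None."""
    candidates = [d for d in detections if d.label == label and d.score >= min_score]
    return max(candidates, key=lambda d: d.score, default=None)


# Illustrative values only.
detections = [
    DetectionSketch(0, 1, "hca", 0.91, (320.0, 240.0), (0.30, 0.20, 0.0), "vision_table"),
    DetectionSketch(1, 1, "battery", 0.78, (330.0, 250.0), (0.31, 0.21, 0.0), "vision_table"),
    DetectionSketch(2, 2, "battery", 0.42, (100.0, 80.0), (0.10, 0.05, 0.0), "vision_table"),
]
best = best_detection(detections, "battery")
print(best.id, best.score)
```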
Some of the parameters are given both in pixel coordinates and in real-world coordinates. The pixel coordinates correspond to pixels in the image. The real-world coordinates are given either w.r.t. the bottom-left corner of the work surface or w.r.t. the camera.
The ID is the index of this detection in the list of detections.
The tracking ID is the ID given to the tracked detection that remains constant over multiple images, so long as the component is being tracked successfully.
The label is the class of the component or device.
The face label is the face of the device that is showing. It can be viewed as the discrete orientation of the device w.r.t. the camera. The possible faces are: front, back, side1, side2. Note that side1 and side2 were used at the beginning of the project but only rarely, so later examples in the data do not include them.
The precise label is the fine-grained classification of the device, determined from the front or back face of an HCA or smoke detector.
The score is the confidence score given by the segmentation network for the component.
The transform is the 6D position and rotation of the centre of the component.
The box is the bounding box of the component.
The obb is the oriented bounding box of the component.
The center is the centre position of the component.
The polygon is the polygon mask of the component.
The OBB 3D is the 3D oriented bounding box of the component.
When depth estimation is not available, a hard-coded height value is used. This value depends on the object: the height of HCAs lies between 25 and 30 mm, and the height of smoke detectors lies between 30 and 60 mm.
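The fallback can be pictured as extruding the flat 4x3 oriented bounding box upwards by a fixed height to obtain the 8x3 box. The sketch below assumes that the z axis points up and that the first four corners lie on the work surface. The heights in `FALLBACK_HEIGHT_M` are assumptions chosen inside the stated ranges, not the values hard-coded in the framework, and `extrude_obb` is a hypothetical helper.

```python
# Assumed fallback heights in meters, picked from within the stated ranges.
FALLBACK_HEIGHT_M = {"hca": 0.025, "smoke_detector": 0.045}


def extrude_obb(obb, height):
    """Return 8 corners: the 4 base corners followed by copies raised by `height`."""
    base = [[float(x), float(y), float(z)] for x, y, z in obb]
    top = [[x, y, z + height] for x, y, z in base]
    return base + top


# A 10 cm x 5 cm box lying flat on the work surface (illustrative values).
obb = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.1, 0.05, 0.0], [0.0, 0.05, 0.0]]
obb_3d = extrude_obb(obb, FALLBACK_HEIGHT_M["hca"])
print(len(obb_3d), len(obb_3d[0]))
```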
Action Blocks¶
An action block is a high-level specification of an operation that can be carried out on the Reconcycle cells. The action block can be a physical movement, an information extractor from the physical environment, or a combination of the two.
Action blocks are high-level, and a single block can consist of multiple actions. For example, the cut block moves an object into the cutter and then activates the cutter to cut the object.
Cut Block¶
The cut block should specify the initial position of the object and the cutter module, where the object is to be cut.
enum from_module
Transform from_tf
enum to_module
Transform to_tf
array obb_3d
enum robot
enum end_effector
bool success
Lever Block¶
The lever block should specify from where to where to carry out the levering action and with which end effector and robot.
enum module
Transform from_tf
Transform to_tf
array obb_3d
enum robot
enum end_effector
bool success
Move Block¶
The move block specifies the start and end positions of an object and which end effector and robot should do the moving.
enum from_module
Transform from_tf
enum to_module
Transform to_tf
array obb_3d
enum robot
enum end_effector
bool success
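A minimal sketch of how the move block fields could be represented in Python follows. `MoveBlockSketch`, `TransformSketch`, and the enum class names are hypothetical stand-ins, and leaving `success` unset until the move has been executed is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List, Optional, Tuple


class Module(Enum):
    vision = auto()
    panda1 = auto()
    panda2 = auto()
    vice = auto()
    cutter = auto()


class Robot(Enum):
    panda1 = auto()
    panda2 = auto()


class EndEffector(Enum):
    soft_hand = auto()
    soft_gripper = auto()
    screwdriver = auto()


@dataclass
class TransformSketch:
    """Hypothetical pose: position in meters plus an (x, y, z, w) quaternion."""
    position: Tuple[float, float, float]
    rotation: Tuple[float, float, float, float] = (0.0, 0.0, 0.0, 1.0)


@dataclass
class MoveBlockSketch:
    from_module: Module
    from_tf: TransformSketch
    to_module: Module
    to_tf: TransformSketch
    obb_3d: List[Tuple[float, float, float]]  # 8 corners
    robot: Robot
    end_effector: EndEffector
    success: Optional[bool] = None  # assumed unset until the move has run


# Illustrative values only.
move = MoveBlockSketch(
    from_module=Module.vision,
    from_tf=TransformSketch((0.30, 0.20, 0.0)),
    to_module=Module.vice,
    to_tf=TransformSketch((0.10, 0.10, 0.05)),
    obb_3d=[(0.0, 0.0, 0.0)] * 8,  # placeholder corners
    robot=Robot.panda1,
    end_effector=EndEffector.soft_hand,
)
print(move.from_module.name, "->", move.to_module.name)
```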
Push Block¶
The push block specifies the start and end positions of the pushing action and which robot and end effector should carry it out.
enum module
Transform from_tf
Transform to_tf
array obb_3d
enum robot
enum end_effector
bool success
Turn Over Block¶
The turn over block specifies the position and 3D oriented bounding box of the object that should be picked up, rotated 180 degrees, and placed down again with the specified robot and end effector.
enum module
Transform tf
array obb_3d
enum robot
enum end_effector
bool success
Vice Block¶
The vice block specifies whether the vice should clamp and turn over, only clamp, or only turn over.
enum module
bool clamp
bool turn_over
bool success
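The two flags give three meaningful combinations, which the sketch below spells out. `ViceBlockSketch` and its `mode` method are hypothetical; how the framework treats the case where both flags are false is not stated, so it is reported here simply as "no operation".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ViceBlockSketch:
    module: str                     # the real field is an enum; a string keeps this short
    clamp: bool
    turn_over: bool
    success: Optional[bool] = None

    def mode(self) -> str:
        """Name the requested vice behaviour from the two flags."""
        if self.clamp and self.turn_over:
            return "clamp and turn over"
        if self.clamp:
            return "clamp only"
        if self.turn_over:
            return "turn over only"
        return "no operation"


print(ViceBlockSketch("vice", clamp=True, turn_over=False).mode())
```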
Vision Block¶
The vision block specifies whether gap detection should be carried out, which camera to use, and above which module. Gap detection is only possible with the realsense camera, and only the realsense camera can be moved to a specified position.
The gap detection is useful for levering actions. The parts detection is useful for moving actions. All coordinates of parts are given in world coordinates with respect to the module.
The parts detection uses a neural network called YOLACT for parts segmentation and a Kalman filter for tracking and re-identification.
The gap detection uses the depth image and a classical clustering approach to determine gaps in the device.
The vision details are a list of detections and gaps (if gap detections were requested and available).
enum camera
enum module
Transform tf
bool gap_detection
Detection[] detections
Gap[] gaps
A detection is defined as the whole or part of a device.
int id
int tracking_id
enum label
float score
Transform tf_px
array box_px
array obb_px
array obb_3d_px
Transform tf
array box
array obb
array obb_3d
array polygon_px
Gap:
int id
Transform from_tf
Transform to_tf
float from_depth
float to_depth
array obb
array obb_3d
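Since gap detection is described as useful for levering actions, one way to connect the two is to reuse a gap's endpoints as the start and end of a lever block. The sketch below is an assumption about how that hand-off might look, not the framework's implementation. `GapSketch`, `LeverBlockSketch`, and `lever_from_gap` are hypothetical, the transforms are reduced to positions, the obb fields are omitted, and the choice of module, robot, and end effector is arbitrary.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class GapSketch:
    id: int
    from_tf: Vec3        # position only; the real field is a Transform
    to_tf: Vec3
    from_depth: float
    to_depth: float


@dataclass
class LeverBlockSketch:
    module: str
    from_tf: Vec3
    to_tf: Vec3
    robot: str
    end_effector: str
    success: Optional[bool] = None


def lever_from_gap(gap, module, robot, end_effector):
    """Reuse the gap's endpoints as the start and end of the levering motion."""
    return LeverBlockSketch(module, gap.from_tf, gap.to_tf, robot, end_effector)


# Illustrative values only.
gap = GapSketch(0, (0.10, 0.20, 0.0), (0.14, 0.20, 0.0), 0.012, 0.003)
lever = lever_from_gap(gap, "vision", "panda2", "screwdriver")
print(lever.from_tf, lever.to_tf)
```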