
GR-ConvNet Code Walkthrough

Contents

Preface
I. utils
- 1. dataset_processing
  - 1. image.py
    - 1. Image class
    - 2. DepthImage class
    - 3. WidthImage class
  - 2. grasp.py
    - 1. _gr_text_to_no() function
    - 2. GraspRectangles class
    - 3. GraspRectangle class
    - 4. Grasp class
    - 5. detect_grasps function
  - 3. generate_cornell_depth.py
  - 4. evaluation.py
- 2. data
  - 1. camera_data.py
  - 2. grasp_data.py
  - 3. cornell_data.py
  - 4. jacquard_data
- 3. visualisation
II. trained-models
III. inference
- 1. grasp_generator.py
- 2. post_process.py
- 3. models
  - grasp_model.py
IV. hardware
- 1. calibrate_camera.py
- 2. camera.py
- 3. device.py
V. train_network.py
VI. Summary

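Preface

Having finished reading the GR-ConvNet paper, the next step is to study its code.
The repository is organised into five parts: hardware, inference, trained-models, utils, and the training script (train_network.py).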

I. utils

This folder holds the commonly used classes and functions.

1. dataset_processing

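1. image.py

image.py defines three classes.

1. Image class

This class wraps an image array and provides convenience methods for manipulating it.

__getattr__: when an attribute that the Image class does not define is accessed, the lookup is forwarded to the underlying array stored in self.img. For example, image.size returns the size of the wrapped array.

    def __getattr__(self, attr):
        # Any attribute not found on self is looked up on the wrapped array instead.
        return getattr(self.img, attr)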
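The delegation pattern above can be tried in isolation. The following is a minimal sketch rather than the repository's code: `ArrayWrapper` and `FakeArray` are hypothetical stand-ins for Image and for the NumPy array it wraps.

```python
class FakeArray:
    """Hypothetical stand-in for a NumPy array: exposes `size` and `shape`."""

    def __init__(self, rows, cols):
        self.shape = (rows, cols)
        self.size = rows * cols


class ArrayWrapper:
    """Hypothetical stand-in for the Image class."""

    def __init__(self, img):
        self.img = img

    def __getattr__(self, attr):
        # Called only when normal attribute lookup on the wrapper fails,
        # so `img` itself is found the usual way and never reaches here.
        return getattr(self.img, attr)


wrapped = ArrayWrapper(FakeArray(480, 640))
print(wrapped.size)   # forwarded to FakeArray.size
print(wrapped.shape)  # forwarded to FakeArray.shape
```

Because `__getattr__` is only a fallback, methods defined on the wrapper keep priority over attributes of the wrapped object.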

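from_file: creates an Image object from a file path.

    @classmethod
    def from_file(cls, fname):
        return cls(imread(fname))

The class also defines rotation, cropping, and resizing methods. Note that the variants whose names end in "d" operate on a copy of the image and leave the original untouched.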

2. DepthImage class

This class inherits from Image, so it can call any of the parent's methods.

from_pcd: builds a depth image from the point-cloud files in the Cornell dataset. See the Cornell dataset description for the exact point-cloud format.

    @classmethod
    def from_pcd(cls, pcd_filename, shape, default_filler=0, index=None):
        img = np.zeros(shape)
        if default_filler != 0:
            img += default_filler

        with open(pcd_filename) as f:
            for l in f.readlines():
                ls = l.split()

                if len(ls) != 5:
                    # Not a point line in the file.
                    continue
                try:
                    # Not a number, carry on.
                    float(ls[0])
                except ValueError:
                    continue

                # Flat pixel index of this point, converted to (row, column).
                i = int(ls[4])
                r = i // shape[1]
                c = i % shape[1]

                if index is None:
                    x = float(ls[0])
                    y = float(ls[1])
                    z = float(ls[2])

                    img[r, c] = np.sqrt(x ** 2 + y ** 2 + z ** 2)

                else:
                    img[r, c] = float(ls[index])

        return cls(img / 1000.0)
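The index arithmetic in from_pcd is easier to follow on a single line of input. The sketch below is an illustration under assumptions: `pcd_line_to_pixel` is a hypothetical helper, the sample line is made up, and the fourth field is simply ignored. The division by 1000 mirrors the final `img / 1000.0` in the code above.

```python
import math


def pcd_line_to_pixel(line, shape):
    """Map one 5-field point line to ((row, col), depth).

    Assumes the field order used by the from_pcd code: x, y, z, a fourth
    field that is ignored here, and the flat pixel index.
    Returns None for lines that are not point lines.
    """
    fields = line.split()
    if len(fields) != 5:
        return None
    try:
        x, y, z = (float(v) for v in fields[:3])
    except ValueError:
        return None
    index = int(fields[4])
    row, col = divmod(index, shape[1])  # shape is (rows, cols)
    depth = math.sqrt(x ** 2 + y ** 2 + z ** 2) / 1000.0
    return (row, col), depth


# Made-up sample line, not taken from the dataset.
pixel, depth = pcd_line_to_pixel("300.0 400.0 1200.0 0 1285", (480, 640))
print(pixel, depth)
```

With a 640-pixel-wide image, flat index 1285 lands on row 2, column 5, since 1285 = 2 × 640 + 5.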

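from_tiff: creates a depth image object directly from a TIFF file.

inpaint: corresponds to a preprocessing step described in the paper.
My guess is that the point cloud does not contain a point for every pixel, so those pixels are left at the filler value 0 and treated as missing; inpaint then fills them in from neighbouring pixels.

gradients: computes the gradients of the depth image with the Sobel operator, which makes edges easier to detect.

normalise: normalises the image.

3. WidthImage class

Crops and normalises the width image. The width image gives, for each pixel, the gripper opening width required there.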

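2. grasp.py

This file defines three classes and two functions.

1. _gr_text_to_no() function

Each line read from the file contains x and y, the coordinates of one of the four corners of a grasp rectangle.
The function converts the coordinates to int, subtracts the offset, and returns the point as (y, x).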
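A minimal sketch of the behaviour just described, written from the description rather than copied from the repository. The name `parse_corner` is hypothetical, and rounding to the nearest integer before the conversion is an assumption.

```python
def parse_corner(line, offset=(0, 0)):
    """Parse an "x y" text line into an integer [y, x] point.

    `offset` is (row_offset, col_offset) and is subtracted from the result.
    """
    x, y = line.split()
    return [int(round(float(y))) - offset[0], int(round(float(x))) - offset[1]]


print(parse_corner("253.0 319.7"))             # [320, 253]
print(parse_corner("253.0 319.7", (100, 50)))  # [220, 203]
```

Returning (y, x) instead of (x, y) matches the row-then-column order used to index image arrays.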

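2. GraspRectangles class

This class represents a collection of grasp rectangles and defines methods that operate on them.
__iter__: lets an instance be traversed with a for loop.

__getattr__: if the requested method is not found on this class, it is looked up on GraspRectangle instead and applied to each rectangle in the collection.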
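The combination of iteration and method forwarding can be sketched as follows. `Rect` and `RectCollection` are hypothetical stand-ins for GraspRectangle and GraspRectangles, and the details are an assumption about how such forwarding is typically written, not the repository's implementation.

```python
class Rect:
    """Hypothetical stand-in for GraspRectangle."""

    def __init__(self, width):
        self.width = width

    def scale(self, factor):
        self.width *= factor
        return self.width


class RectCollection:
    """Hypothetical stand-in for GraspRectangles."""

    def __init__(self, rects):
        self.rects = rects

    def __iter__(self):
        # Makes `for r in collection` work.
        return iter(self.rects)

    def __getattr__(self, attr):
        # Only reached when the collection itself lacks `attr`.
        if callable(getattr(Rect, attr, None)):
            # Return a function that applies the method to every element.
            return lambda *args, **kwargs: [
                getattr(r, attr)(*args, **kwargs) for r in self.rects
            ]
        raise AttributeError(attr)


collection = RectCollection([Rect(10), Rect(20)])
print(collection.scale(0.5))          # [5.0, 10.0]
print([r.width for r in collection])  # [5.0, 10.0]
```

This is why a call such as `grs.scale(scale)` on the collection can affect every rectangle even though the collection defines no `scale` method of its own.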

load_from_array: takes an N×4×2 array describing N grasp rectangles and appends them one by one to the grs list.

    @classmethod
    def load_from_array(cls, arr):
        """
        Load grasp rectangles from numpy array.
        :param arr: Nx4x2 array, where each 4x2 array is the 4 corner pixels of a grasp rectangle.
        :return: GraspRectangles()
        """
        grs = []
        for i in range(arr.shape[0]):
            # Corners of the i-th rectangle (squeeze drops any size-1 dimension).
            grp = arr[i, :, :].squeeze()
            if grp.max() == 0:
                break
            else:
                grs.append(GraspRectangle(grp))
        return cls(grs)

load_from_cornell_file: reads grasp rectangles from a Cornell dataset file.
Each line of a Cornell grasp file is one corner of a grasp rectangle, so the lines are read four at a time. Each group of four corners is used to construct a GraspRectangle, which is appended to the grs list.

    @classmethod
    def load_from_cornell_file(cls, fname):
        """
        Load grasp rectangles from a Cornell dataset grasp file.
        :param fname: Path to text file.
        :return: GraspRectangles()
        """
        grs = []
        with open(fname) as f:
            while True:
                # Load 4 lines at a time, corners of bounding box.
                p0 = f.readline()
                if not p0:
                    break  # EOF
                p1, p2, p3 = f.readline(), f.readline(), f.readline()
                try:
                    gr = np.array([
                        _gr_text_to_no(p0),
                        _gr_text_to_no(p1),
                        _gr_text_to_no(p2),
                        _gr_text_to_no(p3)
                    ])

                    grs.append(GraspRectangle(gr))

                except ValueError:
                    # Some files contain weird values.
                    continue
        return cls(grs)

load_from_jacquard_file: reads grasp rectangles from a Jacquard dataset file.
Note that the Jacquard format differs from the Cornell one: each line holds the grasp centre x, y, the angle θ relative to the horizontal axis, and the gripper width w. I have not yet worked out exactly what h is; my guess is that it is the height of the grasp rectangle.
The lines are read one by one. Note that each grasp is constructed through the Grasp class and converted with as_gr before being appended to the grs list.

    @classmethod
    def load_from_jacquard_file(cls, fname, scale=1.0):
        """
        Load grasp rectangles from a Jacquard dataset file.
        :param fname: Path to file.
        :param scale: Scale to apply (e.g. if resizing images)
        :return: GraspRectangles()
        """
        grs = []
        with open(fname) as f:
            for l in f:
                x, y, theta, w, h = [float(v) for v in l[:-1].split(';')]
                # index based on row, column (y,x), and the Jacquard dataset's angles are flipped around an axis.
                grs.append(Grasp(np.array([y, x]), -theta / 180.0 * np.pi, w, h).as_gr)
        grs = cls(grs)
        grs.scale(scale)
        return grs

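draw: generates the ground-truth grasp quality, grasp angle, and grasp width images from the grasp rectangles.
With shape = 224×224, for example, it first creates three images (three matrices) initialised to 0.
It then calls compact_polygon_coords to get the pixel indices covered by each grasp rectangle.
Finally, the pixels inside the rectangle are set in each of the three images: 1.0 for quality, the rectangle's angle, and the rectangle's length (the gripper opening width). The network is trained so that its three outputs resemble these ground-truth images.

    def draw(self, shape, position=True, angle=True, width=True):
        """
        Plot all GraspRectangles as solid rectangles in a numpy array, e.g. as network training data.
        :param shape: output shape
        :param position: If True, Q output will be produced
        :param angle: If True, Angle output will be produced
        :param width: If True, Width output will be produced
        :return: Q, Angle, Width outputs (or None)
        """
        if position:
            pos_out = np.zeros(shape)
        else:
            pos_out = None
        if angle:
            ang_out = np.zeros(shape)
        else:
            ang_out = None
        if width:
            width_out = np.zeros(shape)
        else:
            width_out = None

        for gr in self.grs:
            # Pixel indices covered by this grasp rectangle in the output image.
            rr, cc = gr.compact_polygon_coords(shape)
            if position:
                pos_out[rr, cc] = 1.0
            if angle:
                ang_out[rr, cc] = gr.angle
            if width:
                # length here is the gripper opening width.
                width_out[rr, cc] = gr.length

        return pos_out, ang_out, width_out

The remaining methods are simple ones, such as copying and computing the centre, and are not covered here.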

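3. GraspRectangle class

This class represents a single grasp rectangle.

__init__: points is a 4×2 matrix holding the four corners of one grasp rectangle.

    def __init__(self, points):
        self.points = points

angle: computes the rotation of the grasp rectangle relative to the horizontal.
A positive angle means a counter-clockwise rotation from the horizontal; the value is in radians and is wrapped to [-π/2, π/2).

    @property
    def angle(self):
        """
        :return: Angle of the grasp to the horizontal.
        """
        dx = self.points[1, 1] - self.points[0, 1]
        dy = self.points[1, 0] - self.points[0, 0]
        return (np.arctan2(-dy, dx) + np.pi / 2) % np.pi - np.pi / 2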
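A few concrete edges make the wrapping in the angle formula visible. The sketch below reuses the same expression on plain (row, col) tuples; `grasp_angle` is a hypothetical free function, and the points are made up.

```python
import math


def grasp_angle(p0, p1):
    """Angle of the edge p0 -> p1 to the horizontal, wrapped to [-pi/2, pi/2).

    Points are (row, col). Image rows grow downwards, hence the -dy.
    """
    dx = p1[1] - p0[1]
    dy = p1[0] - p0[0]
    return (math.atan2(-dy, dx) + math.pi / 2) % math.pi - math.pi / 2


# Edge going right and up in the image: 45 degrees counter-clockwise.
print(math.degrees(grasp_angle((10, 10), (0, 20))))
# The same line traversed in the opposite direction gives the same angle.
print(math.degrees(grasp_angle((0, 20), (10, 10))))
# Edge going left and up: equivalent to -45 degrees.
print(math.degrees(grasp_angle((10, 10), (0, 0))))
```

The modulo by π is what makes the result independent of the direction in which the edge is traversed, which suits a gripper that is symmetric under a half turn.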

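as_grasp: converts the rectangle representation into the centre-and-angle representation, i.e. a Grasp object.

polygon_coords: returns the indices of the pixels that fall inside the grasp rectangle for the given output shape.

    def polygon_coords(self, shape=None):
        """
        :param shape: Output Shape
        :return: Indices of pixels within the grasp rectangle polygon.
        """
        return polygon(self.points[:, 0], self.points[:, 1], shape)

compact_polygon_coords: returns the indices of the pixels inside the rectangle after its length (the gripper opening) is shrunk to one third, i.e. the centre third of the grasp rectangle.

    def compact_polygon_coords(self, shape=None):
        """
        :param shape: Output shape
        :return: Indices of pixels within the centre third of the grasp rectangle.
        """
        return Grasp(self.center, self.angle, self.length / 3, self.width).as_gr.polygon_coords(shape)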

polygon_coords supplies the pixel indices needed to compute the IoU, while compact_polygon_coords is the one used by draw.

iou: computes the IoU. Note that it is only computed when the angle difference is below 30°; otherwise it returns 0.

    def iou(self, gr, angle_threshold=np.pi / 6):
        # Matches the evaluation criterion in the paper: an angle difference
        # above 30 degrees counts as a failed grasp.
        if abs((self.angle - gr.angle + np.pi / 2) % np.pi - np.pi / 2) > angle_threshold:
            return 0

        rr1, cc1 = self.polygon_coords()
        rr2, cc2 = polygon(gr.points[:, 0], gr.points[:, 1])

        try:
            r_max = max(rr1.max(), rr2.max()) + 1
            c_max = max(cc1.max(), cc2.max()) + 1
        except:
            return 0

        canvas = np.zeros((r_max, c_max))
        canvas[rr1, cc1] += 1
        canvas[rr2, cc2] += 1
        union = np.sum(canvas > 0)
        if union == 0:
            return 0
        intersection = np.sum(canvas == 2)
        return intersection / union
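The same two-stage check (angle gate first, then pixel overlap) can be reproduced without NumPy. This is a simplified sketch under assumptions: rectangles are axis-aligned and represented as sets of pixels, and `rect_pixels` and `pixel_iou` are hypothetical helpers, whereas the original rasterises rotated polygons.

```python
import math


def rect_pixels(top, left, height, width):
    """Set of (row, col) pixels covered by an axis-aligned rectangle."""
    return {(r, c) for r in range(top, top + height)
            for c in range(left, left + width)}


def pixel_iou(pixels_a, angle_a, pixels_b, angle_b, angle_threshold=math.pi / 6):
    """IoU of two pixel sets, or 0 if the grasp angles differ too much."""
    diff = (angle_a - angle_b + math.pi / 2) % math.pi - math.pi / 2
    if abs(diff) > angle_threshold:
        return 0.0
    union = len(pixels_a | pixels_b)
    if union == 0:
        return 0.0
    return len(pixels_a & pixels_b) / union


a = rect_pixels(0, 0, 4, 4)
b = rect_pixels(2, 2, 4, 4)
print(pixel_iou(a, 0.0, b, 0.0))               # 4 / 28
print(pixel_iou(a, 0.0, b, math.radians(45)))  # 0.0, angles differ by more than 30 degrees
```

The two 4×4 squares share a 2×2 block, so the intersection is 4 pixels and the union is 16 + 16 - 4 = 28.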

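The remaining methods are simple ones, such as cropping, and are not covered here.

4. Grasp class

This class represents a grasp by its centre, angle, length, and width.
Converting between Grasp and GraspRectangle implements the step described in the paper of mapping the network's per-pixel outputs to a grasp rectangle.

as_gr: converts the grasp (built from the network's output values) into the corresponding grasp rectangle.

max_iou: computes the maximum IoU between the predicted grasp and a set of grasp rectangles.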

5. detect_grasps function

This function builds Grasp objects from the network's q, angle, and width output images for later use.
It uses peak_local_max to search for local maxima in the q image, since each local maximum is a candidate centre of a grasp rectangle.

    def detect_grasps(q_img, ang_img, width_img=None, no_grasps=1):
        """
        Detect grasps in a network output.
        :param q_img: Q image network output
        :param ang_img: Angle image network output
        :param width_img: (optional) Width image network output
        :param no_grasps: Max number of grasps to return
        :return: list of Grasps
        """
        local_max = peak_local_max(q_img, min_distance=20, threshold_abs=0.2, num_peaks=no_grasps)

        grasps = []
        for grasp_point_array in local_max:
            grasp_point = tuple(grasp_point_array)

            grasp_angle = ang_img[grasp_point]

            g = Grasp(grasp_point, grasp_angle)
            if width_img is not None:
                g.length = width_img[grasp_point]
                g.width = g.length / 2

            grasps.append(g)

        return grasps
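To see the lookup logic without scikit-image, the following sketch handles only the single-grasp case. It is a deliberate simplification: it takes the global maximum of the quality map instead of running a local-peak search, works on nested lists, and `best_grasp` is a hypothetical name. The 0.2 threshold and the width = length / 2 rule are taken from the code above.

```python
def best_grasp(q_img, ang_img, width_img=None, threshold=0.2):
    """Return (point, angle, length, width) for the highest-quality pixel.

    Simplification: takes the single global maximum of q_img instead of
    running a local-peak search. Returns None if no pixel exceeds `threshold`.
    """
    best_q, best_point = threshold, None
    for r, row in enumerate(q_img):
        for c, q in enumerate(row):
            if q > best_q:
                best_q, best_point = q, (r, c)
    if best_point is None:
        return None
    r, c = best_point
    length = width_img[r][c] if width_img is not None else None
    width = length / 2 if length is not None else None
    return best_point, ang_img[r][c], length, width


q = [[0.1, 0.3, 0.1],
     [0.2, 0.9, 0.4],
     [0.0, 0.1, 0.2]]
ang = [[0.0, 0.1, 0.2],
       [0.3, 0.5, 0.6],
       [0.7, 0.8, 0.9]]
wid = [[10, 20, 30],
       [40, 60, 70],
       [80, 90, 100]]
print(best_grasp(q, ang, wid))  # ((1, 1), 0.5, 60, 30.0)
```

The key point carried over from the original is that the quality map decides where to grasp, and the angle and width maps are only read at that location.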

3. generate_cornell_depth.py

This script calls the from_pcd method of the DepthImage class described earlier to generate the depth images for the Cornell dataset.

    if __name__ == '__main__':
        parser = argparse.ArgumentParser(description='Generate depth images from Cornell PCD files.')
        parser.add_argument('path', type=str, help='Path to Cornell Grasping Dataset')
        args = parser.parse_args()

        pcds = glob.glob(os.path.join(args.path, '*', 'pcd*[0-9].txt'))
        pcds.sort()

        for pcd in pcds:
            di = DepthImage.from_pcd(pcd, (480, 640))
            di.inpaint()

            of_name = pcd.replace('.txt', 'd.tiff')
            print(of_name)
            imsave(of_name, di.img.astype(np.float32))

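4. evaluation.py

plot_output: plots the network's quality, angle, and width output images.

calculate_iou_match: checks whether the IoU exceeds 25%.

Detail: a single prediction may have several ground-truth grasps to compare against (one object can be grasped in more than one way).
The grasp counts as successful as long as the largest of those IoU values exceeds the threshold.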
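The matching rule reduces to a maximum over the ground-truth overlaps. A minimal sketch, assuming the IoU values have already been computed; `iou_match` is a hypothetical name and the numbers are made up.

```python
def iou_match(ious, threshold=0.25):
    """True if the best IoU against any ground-truth grasp exceeds the threshold."""
    return max(ious, default=0.0) > threshold


print(iou_match([0.05, 0.31, 0.12]))  # True: the best overlap is 0.31
print(iou_match([0.05, 0.20]))        # False
print(iou_match([]))                  # False: no ground truth to match
```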

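2. data

1. camera_data.py

This file defines the CameraData class, which handles reading data from the camera.
By default the input image is 640×480 and the output image is 224×224 (preprocessing to match the network's input size). The class computes the top-left and bottom-right coordinates of the corresponding crop.
It then takes the depth image and the RGB image in turn and concatenates them along the channel dimension.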
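The corner computation can be sketched as below, assuming the crop is centred in the input image (the text only says that the corners are computed). `centre_crop_box` is a hypothetical helper, and corners are returned in (row, col) order.

```python
def centre_crop_box(width=640, height=480, output_size=224):
    """Return ((top, left), (bottom, right)) of a centred square crop."""
    left = (width - output_size) // 2
    top = (height - output_size) // 2
    right = (width + output_size) // 2
    bottom = (height + output_size) // 2
    return (top, left), (bottom, right)


top_left, bottom_right = centre_crop_box()
print(top_left, bottom_right)  # (128, 208) (352, 432)
```

For the default sizes the crop spans rows 128 to 352 and columns 208 to 432, which is exactly 224 pixels in each direction.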

2. grasp_data.py

GraspDatasetBase is the base class used to load the datasets.
It inherits from torch.utils.data.Dataset, so it must override the __getitem__ and __len__ methods.

    import random

    import numpy as np
    import torch
    import torch.utils.data


    class GraspDatasetBase(torch.utils.data.Dataset):
        """
        An abstract dataset for training networks in a common format.
        """

        def __init__(self, output_size=224, include_depth=True, include_rgb=False, random_rotate=False,
                     random_zoom=False, input_only=False):
            """
            :param output_size: Image output size in pixels (square)
            :param include_depth: Whether depth image is included
            :param include_rgb: Whether RGB image is included
            :param random_rotate: Whether random rotations are applied
            :param random_zoom: Whether random zooms are applied
            :param input_only: Whether to return only the network input (no labels)
            """
            self.output_size = output_size
            self.random_rotate = random_rotate
            self.random_zoom = random_zoom
            self.input_only = input_only
            self.include_depth = include_depth
            self.include_rgb = include_rgb

            self.grasp_files = []

            if include_depth is False and include_rgb is False:
                raise ValueError('At least one of Depth or RGB must be specified.')

        @staticmethod
        def numpy_to_torch(s):
            if len(s.shape) == 2:
                return torch.from_numpy(np.expand_dims(s, 0).astype(np.float32))
            else:
                return torch.from_numpy(s.astype(np.float32))

        # Abstract methods; subclasses must override them.
        # get_gtbb returns the ground-truth rectangles with the rotation and zoom applied.
        def get_gtbb(self, idx, rot=0, zoom=1.0):
            raise NotImplementedError()

        def get_depth(self, idx, rot=0, zoom=1.0):
            raise NotImplementedError()

        def get_rgb(self, idx, rot=0, zoom=1.0):
            raise NotImplementedError()

        def __getitem__(self, idx):
            if self.random_rotate:
                rotations = [0, np.pi / 2, 2 * np.pi / 2, 3 * np.pi / 2]
                rot = random.choice(rotations)  # Pick one rotation at random.
            else:
                rot = 0.0

            if self.random_zoom:
                zoom_factor = np.random.uniform(0.5, 1.0)
            else:
                zoom_factor = 1.0

            # Load the depth image
            if self.include_depth:
                depth_img = self.get_depth(idx, rot, zoom_factor)

            # Load the RGB image
            if self.include_rgb:
                rgb_img = self.get_rgb(idx, rot, zoom_factor)

            # Load the grasp rectangles, rotated and zoomed to match the image.
            bbs = self.get_gtbb(idx, rot, zoom_factor)

            # Build the ground-truth quality, angle, and width images from the rectangles.
            pos_img, ang_img, width_img = bbs.draw((self.output_size, self.output_size))
            width_img = np.clip(width_img, 0.0, self.output_size / 2) / (self.output_size / 2)

            if self.include_depth and self.include_rgb:
                x = self.numpy_to_torch(
                    np.concatenate(
                        (np.expand_dims(depth_img, 0),
                         rgb_img),
                        0
                    )
                )
            elif self.include_depth:
                x = self.numpy_to_torch(depth_img)
            elif self.include_rgb:
                x = self.numpy_to_torch(rgb_img)

            pos = self.numpy_to_torch(pos_img)
            cos = self.numpy_to_torch(np.cos(2 * ang_img))
            sin = self.numpy_to_torch(np.sin(2 * ang_img))
            width = self.numpy_to_torch(width_img)

            # These targets are compared with the network outputs during training to compute the loss.
            return x, (pos, cos, sin, width), idx, rot, zoom_factor

        def __len__(self):
            return len(self.grasp_files)
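Two of the label transformations in __getitem__ are worth checking on scalars: the angle is stored as cos 2θ and sin 2θ, and the width is clipped and scaled to [0, 1]. The sketch below is an illustration with hypothetical helper names and made-up values, using the same formulas as the code above.

```python
import math


def encode_angle(theta):
    """Encode a grasp angle as (cos 2θ, sin 2θ)."""
    return math.cos(2 * theta), math.sin(2 * theta)


def normalise_width(width_px, output_size=224):
    """Clip a width in pixels to [0, output_size / 2] and scale it to [0, 1]."""
    half = output_size / 2
    return min(max(width_px, 0.0), half) / half


theta = math.radians(30)
print(encode_angle(theta))
print(encode_angle(theta + math.pi))  # same values up to rounding
print(normalise_width(56))            # 0.5
print(normalise_width(300))           # 1.0, clipped
```

Doubling the angle means that θ and θ + π, which describe the same physical grasp, map to the same pair of targets, so the network is not asked to learn a discontinuity at ±π/2.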

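3. cornell_data.py

Loads data from the Cornell dataset.

4. jacquard_data

Loads data from the Jacquard dataset.

3. visualisation

Visualises the input images and the network's output images.

II. trained-models

Holds previously trained network models.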

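III. inference

This folder contains two files and the models folder.

1. grasp_generator.py

The following excerpt corresponds to the camera-to-robot coordinate transformation in the paper.

    # Convert camera to robot coordinates
    camera2robot = self.cam_pose
    target_position = np.dot(camera2robot[0:3, 0:3], target) + camera2robot[0:3, 3:]
    target_position = target_position[0:3, 0]

    # Convert camera to robot angle
    angle = np.asarray([0, 0, grasps[0].angle])
    angle.shape = (3, 1)
    target_angle = np.dot(camera2robot[0:3, 0:3], angle)

This file implements the GraspGenerator class. It acquires image data from a depth camera, predicts the grasp pose with a pre-trained deep learning model, converts the grasp pose from the camera frame to the robot frame, saves the grasp pose, and, in visualisation mode, plots it.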
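The position part of that conversion is a rotation followed by a translation, both read from a 4×4 homogeneous matrix. The following sketch is an illustration on plain lists: `camera_to_robot` is a hypothetical helper and the pose is made up, not a real calibration result.

```python
def camera_to_robot(cam_pose, point_cam):
    """Apply a 4x4 homogeneous camera-to-robot transform to a 3D point.

    cam_pose: 4x4 nested list [[R | t], [0 0 0 1]].
    point_cam: [x, y, z] in the camera frame.
    """
    return [
        sum(cam_pose[i][j] * point_cam[j] for j in range(3)) + cam_pose[i][3]
        for i in range(3)
    ]


# Made-up pose: camera rotated 90 degrees about z and shifted by (0.5, 0.0, 0.2).
cam_pose = [
    [0.0, -1.0, 0.0, 0.5],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.2],
    [0.0, 0.0, 0.0, 1.0],
]
print(camera_to_robot(cam_pose, [0.1, 0.2, 0.3]))  # approximately [0.3, 0.1, 0.5]
```

The angle is handled the same way in the excerpt above, except that only the rotation block is applied, since a direction is not affected by the translation.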

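2. post_process.py

post_process_output: post-processes the raw network output. It converts the outputs from PyTorch tensors to NumPy arrays, computes the angle, and applies Gaussian filtering. The goal is to smooth and adjust the network output into a more accurate and usable result; it is the post-processing stage of grasp pose prediction.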
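The angle step can be sketched on its own, assuming the network predicts cos 2θ and sin 2θ as in the training targets, so that θ is recovered with a half-angle arctangent. The helper names are hypothetical, and the tensor conversion and Gaussian filtering are left out.

```python
import math


def decode_angle(cos_2theta, sin_2theta):
    """Recover θ in [-π/2, π/2] from the (cos 2θ, sin 2θ) pair."""
    return 0.5 * math.atan2(sin_2theta, cos_2theta)


def decode_angle_map(cos_img, sin_img):
    """Apply decode_angle to every pixel of two equally sized nested lists."""
    return [[decode_angle(c, s) for c, s in zip(cos_row, sin_row)]
            for cos_row, sin_row in zip(cos_img, sin_img)]


theta = math.radians(-40)
cos_img = [[math.cos(2 * theta), 1.0]]
sin_img = [[math.sin(2 * theta), 0.0]]
print([[math.degrees(a) for a in row] for row in decode_angle_map(cos_img, sin_img)])
```

Using atan2 on the pair, rather than an inverse cosine or sine alone, keeps the sign of the angle and tolerates predictions that are not exactly on the unit circle.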

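3. models

The models folder contains the grasp_model file and the model architecture of the GR-ConvNet network.

grasp_model.py

This file mainly covers the loss computation, which uses the smooth L1 loss. Pay attention to the format of the value the function returns.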
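As an illustration of the loss computation, the sketch below sums a smooth L1 (Huber-style) loss over the four output heads. It is a sketch under assumptions, written in plain Python rather than PyTorch: the dictionary layout, the key names, and `beta = 1.0` are illustrative choices, not a copy of the repository's return value.

```python
def smooth_l1(pred, target, beta=1.0):
    """Mean smooth L1 (Huber-style) loss over two equal-length lists."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        total += 0.5 * diff ** 2 / beta if diff < beta else diff - 0.5 * beta
    return total / len(pred)


def compute_loss(preds, targets):
    """Sum the per-head losses; preds/targets are (pos, cos, sin, width) tuples."""
    names = ("p_loss", "cos_loss", "sin_loss", "width_loss")
    losses = {n: smooth_l1(p, t) for n, p, t in zip(names, preds, targets)}
    return {"loss": sum(losses.values()), "losses": losses}


preds = ([0.5, 1.0], [1.0, 0.0], [0.0, 0.0], [2.0, 0.0])
targets = ([0.0, 1.0], [1.0, 0.0], [0.0, 0.0], [0.0, 0.0])
result = compute_loss(preds, targets)
print(result["losses"]["p_loss"], result["losses"]["width_loss"], result["loss"])
```

Small errors are penalised quadratically and large ones linearly, which is why the width error of 2.0 contributes 1.5 rather than 2.0 before averaging.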

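IV. hardware

This folder is split into three modules.

1. calibrate_camera.py

This code uses an Intel RealSense camera to acquire colour and depth images and display them. The RealSenseCamera class encapsulates connecting to the camera, grabbing images, and visualising them.

2. camera.py

3. device.py

Selects the device to run on (GPU or CPU).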

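V. train_network.py

This file trains and validates the model on the two standard datasets.
It is organised into the validate, train, and run functions.
Note the structure of the values returned by train and validate.

VI. Summary

That concludes this walkthrough of the GR-ConvNet code.
Reading the code clarified some of the finer processing details in the paper and deepened my understanding of it.
The next step is to actually run the code!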
