An Introduction to gfx-rs/hal, a Cross-Platform Graphics Abstraction Library
Published: 2019-06-25


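For the document list, see:

gfx-hal is a low-level, cross-platform graphics abstraction library written in Rust. It consists of the following layers and components: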

  • gfx-HAL
  • gfx-backend-
    • Metal
    • Vulkan
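    • OpenGL (in development; because GL differs so much from the next-generation Vulkan API, this module may never be finished)
    • OpenGL ES (in development; because GL differs so much from the next-generation Vulkan API, this module may never be finished)
    • DirectX 11
    • DirectX 12
    • WebGL (in development; because GL differs so much from the next-generation Vulkan API, this module may never be finished)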
  • gfx-warden

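This document covers only the master branch, which corresponds to the new HAL interface, and ignores the old pre-ll interface. It also covers only offscreen rendering and compute shaders with gfx-hal, that is, render-to-texture and GPGPU. For rendering to a window and handling mouse and keyboard events, see the demos that ship with gfx.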

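Initializing a concrete graphics backend

The gfx-hal interface mirrors the Vulkan interface almost 1:1, so the many Vulkan tutorials apply. As in Vulkan, everything starts with creating an Instance.

    #[cfg(any(feature = "vulkan", feature = "dx12", feature = "metal"))]
    let instance = backend::Instance::create("name", 1 /* version */);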

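Different devices and queues require the adapter to meet different capability requirements. These are described one by one below.

Creating devices with different capabilities

The overall flow is backend::Instance::create() -> enumerate_adapters() -> open_with()

  • enumerate_adapters() returns the list of currently available Adapters. At this step we first pick an adapter that supports the required queue capabilities, so that when the logical Device is opened in the next step the selector closure can simply return true.
  • The first argument of open_with(), count, is the number of queues to open from the selected QueueFamily. Mobile devices currently have only one GPU in most cases, so passing 1 is enough. My current, subjective view is that opening more than one does not improve the overall graphics performance of an app.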

Once configured, running on macOS prints information similar to the following. The configuration details are covered later.

    AdapterInfo {
        name: "Intel Iris Pro Graphics",
        vendor: 0,
        device: 0,
        device_type: IntegratedGpu
    }
    Limits {
        max_texture_size: 4096,
        max_texel_elements: 16777216,
        ...
    }
    Memory types: [
        MemoryType { properties: DEVICE_LOCAL, heap_index: 0 },
        MemoryType { properties: COHERENT | CPU_VISIBLE, heap_index: 1 },
        ...
    ]

A device that supports rendering only

Rendering is the reason graphics devices exist, so we simply take the first adapter and carry on.

    let mut adapter = instance.enumerate_adapters().remove(0);
    let (mut device, mut queue_group) = adapter
        .open_with::<_, Graphics>(1, |_family| true)
        .unwrap();

A device that supports compute only

Older OpenGL versions do not support compute shaders, so adapters must be filtered in that case. If you only build for Metal/Vulkan, enumerate_adapters().remove(0) works just as above. open_with() is adjusted accordingly.

    let mut adapter = instance
        .enumerate_adapters()
        .into_iter()
        .find(|a| {
            a.queue_families
                .iter()
                .any(|family| family.supports_compute())
        })
        .expect("Failed to find a GPU with compute support!");
    let (mut device, mut queue_group) = adapter
        .open_with::<_, Compute>(1, |_family| true)
        .unwrap();

A device that supports both rendering and compute

As above, but both the adapter filter and open_with() are changed to require rendering and compute together.

    let mut adapter = instance
        .enumerate_adapters()
        .into_iter()
        .find(|a| {
            a.queue_families
                .iter()
                .any(|family| family.supports_graphics() && family.supports_compute())
        })
        .expect("Failed to find a GPU with graphics and compute support!");
    let (mut device, mut queue_group) = adapter
        .open_with::<_, General>(1, |_family| true)
        .unwrap();
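The three selection rules above differ only in the predicate applied to each queue family. The following is a minimal sketch of that idea using plain Rust and no gfx-hal dependency. The QueueFamily, Adapter and Capability types and the pick_adapter function are hypothetical stand-ins invented for illustration, not the library's types.

```rust
// Hypothetical stand-ins for gfx-hal's adapter and queue family types.
#[derive(Debug, Clone, Copy, PartialEq)]
struct QueueFamily {
    graphics: bool,
    compute: bool,
}

#[derive(Debug, Clone, PartialEq)]
struct Adapter {
    name: &'static str,
    queue_families: Vec<QueueFamily>,
}

// Mirrors the Graphics / Compute / General type parameters used with open_with().
#[derive(Debug, Clone, Copy, PartialEq)]
enum Capability {
    Graphics,
    Compute,
    General,
}

/// True when a single queue family satisfies the requested capability.
fn family_supports(family: &QueueFamily, cap: Capability) -> bool {
    match cap {
        Capability::Graphics => family.graphics,
        Capability::Compute => family.compute,
        // General means one family must offer both at once.
        Capability::General => family.graphics && family.compute,
    }
}

/// Returns the first adapter that has at least one suitable queue family.
fn pick_adapter(adapters: Vec<Adapter>, cap: Capability) -> Option<Adapter> {
    adapters
        .into_iter()
        .find(|a| a.queue_families.iter().any(|f| family_supports(f, cap)))
}
```

Returning an Option instead of calling expect() leaves the decision about a missing GPU to the caller, which may be preferable in a library.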

With a Device and a QueueGroup in hand, we can start creating resources such as Images (think of them as the Vulkan version of textures) and Pipelines.

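Creating resources

Buffer

Creating a Buffer

Buffer and Image objects do not store data themselves. They describe the requirements that the data storage must meet, and those requirements are then used to create the Memory.

    let usage = buffer::Usage::TRANSFER_SRC | buffer::Usage::TRANSFER_DST;
    let unbound = device.create_buffer(required_size_in_bytes, usage).unwrap();

  • required_size_in_bytes: the amount of memory needed, in bytes. Here it is only recorded as a requirement; the storage itself is created by Memory, described later.
  • usage: configure this according to how the Buffer is really used. Different combinations have a significant effect on performance, so choose carefully.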
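The usage value is a set of bit flags, so it helps to see how such a set is combined and queried. The sketch below uses plain u32 constants. The numeric values mirror Vulkan's VkBufferUsageFlagBits and are an assumption for illustration only; the contains and intersects helpers are hypothetical and merely imitate the style of query that appears in the Metal backend source quoted later in this article.

```rust
// Illustrative bit values only (assumed to mirror Vulkan's VkBufferUsageFlagBits).
const TRANSFER_SRC: u32 = 0x01;
const TRANSFER_DST: u32 = 0x02;
const UNIFORM_TEXEL: u32 = 0x04;
const STORAGE_TEXEL: u32 = 0x08;
const STORAGE: u32 = 0x20;

/// True when every flag in `wanted` is present in `usage`.
fn contains(usage: u32, wanted: u32) -> bool {
    usage & wanted == wanted
}

/// True when at least one flag in `wanted` is present in `usage`.
fn intersects(usage: u32, wanted: u32) -> bool {
    usage & wanted != 0
}

/// Usage for a CPU-visible staging buffer: it is only copied from and to.
fn staging_usage() -> u32 {
    TRANSFER_SRC | TRANSFER_DST
}

/// Usage for a device-local buffer that a compute shader also reads and writes.
fn device_usage() -> u32 {
    TRANSFER_SRC | TRANSFER_DST | STORAGE
}
```

Requesting only the flags a buffer needs gives the backend more freedom in how it places and aligns the resource.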

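Destroying a Buffer

    device.destroy_buffer(buffer);

Binding a Buffer to Memory

    let buffer = device.bind_buffer_memory(&memory, 0, unbound).unwrap();

Only after bind_buffer_memory does the Buffer object have real storage behind it, but the data still lives in the Memory object. To update the data attached to a Buffer later, map the Memory and modify it.

After binding the Buffer to Memory, macOS typically prints the following. The 256 in length = 256 is related to one of the Limits printed earlier; the exact value depends on the GPU: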

    buffer = Buffer { raw:
        label =
        length = 256 cpuCacheMode = MTLCPUCacheModeDefaultCache storageMode = MTLStorageModeShared resourceOptions = MTLResourceCPUCacheModeDefaultCache MTLResourceStorageModeShared purgeableState = MTLPurgeableStateNonVolatile,
      range: 0..6, options: CPUCacheModeDefaultCache | StorageModeShared }
    memory = Memory { heap: Public(MemoryTypeId(1),
        label =
        length = 256 cpuCacheMode = MTLCPUCacheModeDefaultCache storageMode = MTLStorageModeShared resourceOptions = MTLResourceCPUCacheModeDefaultCache MTLResourceStorageModeShared purgeableState = MTLPurgeableStateNonVolatile),
      size: 256 }
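The output shows a buffer that covers bytes 0..6 while the underlying allocation is 256 bytes long. One plausible explanation, based on the Metal backend source quoted later in this article, is that the reported size requirement is rounded up to a multiple of 256. The sketch below reproduces only that rounding step; the function name padded_size is hypothetical.

```rust
// Low 8 bits set: adding the mask and clearing those bits rounds up to a multiple of 256.
const SIZE_MASK: u64 = 0xFF;

/// Rounds a requested byte size up to the next multiple of 256.
fn padded_size(requested: u64) -> u64 {
    (requested + SIZE_MASK) & !SIZE_MASK
}
```

With this rule a 6-byte request becomes a 256-byte allocation, which matches the length and range values printed above.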

Creating a BufferView

    let format = Some(format::Format::Rg4Unorm);
    let size = data_source.len();
    let buffer_view = device.create_buffer_view(buffer, format, 0..size);

Destroying a BufferView

    device.destroy_buffer_view(buffer_view);

Memory

Memory allocates the storage that holds the data required by Buffers and Images.

Creating Memory

    // A note about performance: Using CPU_VISIBLE memory is convenient because it can be
    // directly memory mapped and easily updated by the CPU, but it is very slow and so should
    // only be used for small pieces of data that need to be updated very frequently. For something like
    // a vertex buffer that may be much larger and should not change frequently, you should instead
    // use a DEVICE_LOCAL buffer that gets filled by copying data from a CPU_VISIBLE staging buffer.

    // Query the requirements first: the type search below depends on mem_req.type_mask.
    let mem_req = device.get_buffer_requirements(&unbound);
    let upload_type = memory_types
        .iter()
        .enumerate()
        .position(|(id, mem_type)| {
            mem_req.type_mask & (1 << id) != 0
                && mem_type.properties.contains(memory::Properties::CPU_VISIBLE)
        })
        .unwrap()
        .into();
    let memory = device.allocate_memory(upload_type, mem_req.size).unwrap();
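The memory type search combines two tests: bit id of type_mask says whether the resource may live in memory type id, and the properties check says whether that type has the features the application wants. The following self-contained sketch shows the same search without gfx-hal. The MemoryType struct, the property bit values and find_memory_type are hypothetical and exist only for this illustration.

```rust
// Hypothetical property bits; the real flags are in gfx-hal's memory::Properties.
const DEVICE_LOCAL: u32 = 0x1;
const CPU_VISIBLE: u32 = 0x2;
const COHERENT: u32 = 0x4;

// Hypothetical stand-in for an entry of the adapter's memory type list.
struct MemoryType {
    properties: u32,
}

/// Index of the first memory type that is allowed by `type_mask`
/// and has all of the `required` property bits.
fn find_memory_type(types: &[MemoryType], type_mask: u64, required: u32) -> Option<usize> {
    types.iter().enumerate().position(|(id, mem_type)| {
        type_mask & (1u64 << id) != 0 && mem_type.properties & required == required
    })
}

/// Memory types resembling the macOS output shown earlier in the article.
fn example_types() -> Vec<MemoryType> {
    vec![
        MemoryType { properties: DEVICE_LOCAL },
        MemoryType { properties: CPU_VISIBLE | COHERENT },
    ]
}
```

Returning an Option makes the "no suitable memory type" case explicit instead of panicking inside unwrap().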

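Writing to Memory

Reading or writing Memory requires mapping it through the corresponding Writer or Reader. For thread safety, you must add the appropriate Fences by hand.

    // T stands for the element type of data_source.
    let mut data_target = device
        .acquire_mapping_writer::<T>(&memory, 0..size)
        .unwrap();
    data_target[0..data_source.len()].copy_from_slice(data_source);
    device.release_mapping_writer(data_target);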

Reading from Memory

    // T stands for the element type of numbers.
    let reader = device
        .acquire_mapping_reader::<T>(&staging_memory, 0..staging_size)
        .unwrap();
    println!(
        "Times: {:?}",
        reader[0..numbers.len()]
            .into_iter()
            .map(|n| *n)
            .collect::<Vec<_>>()
    );
    device.release_mapping_reader(reader);

Image

Creating an Image

Like a Buffer object, an Image object does not store the texture data itself.

    let kind = image::Kind::D2(
        dims.width as image::Size,
        dims.height as image::Size,
        1, /* Layer */
        1, /* NumSamples */
    );
    let unbound = device
        .create_image(
            kind,
            1,
            ColorFormat::SELF,
            image::Tiling::Optimal,
            image::Usage::TRANSFER_DST | image::Usage::SAMPLED,
            image::StorageFlags::empty(),
        )
        .unwrap();

Likewise, the Usage given when creating an Image should be combined according to how the Image is really used. A poorly chosen combination reduces performance.

Destroying an Image

    device.destroy_image(image);

Binding an Image to Memory

    let image = device.bind_image_memory(&memory, 0, unbound).unwrap();

Creating an ImageView

Creating a Sampler

Organizing draw commands

Creating a Submission

Creating, reading and writing hal buffers

    buffer::Usage::TRANSFER_SRC | buffer::Usage::TRANSFER_DST,

Purpose and differences

    // B stands for the backend type; T stands for the element type of numbers.

    // CPU-visible staging buffer: the CPU writes the input here.
    let (staging_memory, staging_buffer, staging_size) = create_buffer::<B>(
        &mut device,
        &memory_properties.memory_types,
        memory::Properties::CPU_VISIBLE | memory::Properties::COHERENT,
        buffer::Usage::TRANSFER_SRC | buffer::Usage::TRANSFER_DST,
        stride,
        numbers.len() as u64,
    );

    // Device-local buffer: the GPU works on this copy.
    let (device_memory, device_buffer, _device_buffer_size) = create_buffer::<B>(
        &mut device,
        &memory_properties.memory_types,
        memory::Properties::DEVICE_LOCAL,
        buffer::Usage::TRANSFER_SRC | buffer::Usage::TRANSFER_DST | buffer::Usage::STORAGE,
        stride,
        numbers.len() as u64,
    );

    // Fill the staging buffer through a mapping writer.
    {
        let mut writer = device
            .acquire_mapping_writer::<T>(&staging_memory, 0..staging_size)
            .unwrap();
        writer[0..numbers.len()].copy_from_slice(&numbers);
        device.release_mapping_writer(writer);
    }
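The two buffers differ in who can touch them: the staging buffer is CPU-visible and can be mapped, while the device-local buffer is only reachable through a GPU copy. The sketch below models that difference in plain Rust. The MemoryHeap enum, map_memory, copy_buffer and upload_via_staging are hypothetical, and copy_buffer merely stands in for a transfer command that a real program would record and submit to a queue.

```rust
// Hypothetical model: bytes are kept in a Vec regardless of where they "live".
#[derive(Debug)]
enum MemoryHeap {
    /// CPU-visible memory, can be mapped.
    Public(Vec<u8>),
    /// Device-local memory, cannot be mapped by the CPU.
    Private(Vec<u8>),
}

/// Maps a heap for CPU access; fails for device-local memory.
fn map_memory(heap: &mut MemoryHeap) -> Result<&mut [u8], &'static str> {
    match heap {
        MemoryHeap::Public(bytes) => Ok(bytes.as_mut_slice()),
        MemoryHeap::Private(_) => Err("memory is not CPU-visible"),
    }
}

/// Read-only view of the stored bytes, used here only to inspect results.
fn heap_bytes(heap: &MemoryHeap) -> &[u8] {
    match heap {
        MemoryHeap::Public(bytes) | MemoryHeap::Private(bytes) => bytes,
    }
}

/// Stand-in for a GPU transfer: copies as many bytes as fit in `dst`.
fn copy_buffer(src: &MemoryHeap, dst: &mut MemoryHeap) {
    let src_bytes = heap_bytes(src);
    let dst_bytes = match dst {
        MemoryHeap::Public(bytes) | MemoryHeap::Private(bytes) => bytes,
    };
    let n = src_bytes.len().min(dst_bytes.len());
    dst_bytes[..n].copy_from_slice(&src_bytes[..n]);
}

/// Writes `data` into a temporary staging heap, then copies it to `device_local`.
fn upload_via_staging(data: &[u8], device_local: &mut MemoryHeap) -> Result<(), &'static str> {
    let mut staging = MemoryHeap::Public(vec![0; data.len()]);
    map_memory(&mut staging)?.copy_from_slice(data);
    copy_buffer(&staging, device_local);
    Ok(())
}
```

Reading results back follows the same route in reverse: copy from the device-local buffer into the staging buffer, then map the staging memory.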

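Metal module

    fn create_buffer(
        &self, size: u64, usage: buffer::Usage
    ) -> Result<n::UnboundBuffer, buffer::CreationError> {
        debug!("create_buffer of size {} and usage {:?}", size, usage);
        // Nothing is allocated yet; only the requirements are recorded.
        Ok(n::UnboundBuffer {
            size,
            usage,
        })
    }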

    fn get_buffer_requirements(&self, buffer: &n::UnboundBuffer) -> memory::Requirements {
        let mut max_size = buffer.size;
        let mut max_alignment = self.private_caps.buffer_alignment;

        if self.private_caps.resource_heaps {
            // We don't know what memory type the user will try to allocate the buffer with,
            // so we test them all to get the most stringent ones.
            for (i, _mt) in self.memory_types.iter().enumerate() {
                let (storage, cache) = MemoryTypes::describe(i);
                let options = conv::resource_options_from_storage_and_cache(storage, cache);
                let requirements = self.shared.device.lock()
                    .heap_buffer_size_and_align(buffer.size, options);
                max_size = cmp::max(max_size, requirements.size);
                max_alignment = cmp::max(max_alignment, requirements.align);
            }
        }

        // based on Metal validation error for view creation:
        // failed assertion `BytesPerRow of a buffer-backed texture with pixelFormat(XXX) must be aligned to 256 bytes
        const SIZE_MASK: u64 = 0xFF;
        let supports_texel_view = buffer.usage.intersects(
            buffer::Usage::UNIFORM_TEXEL | buffer::Usage::STORAGE_TEXEL
        );

        memory::Requirements {
            size: (max_size + SIZE_MASK) & !SIZE_MASK,
            alignment: max_alignment,
            type_mask: if !supports_texel_view || self.private_caps.shared_textures {
                MemoryTypes::all().bits()
            } else {
                (MemoryTypes::all() ^ MemoryTypes::SHARED).bits()
            },
        }
    }

    fn allocate_memory(
        &self, memory_type: hal::MemoryTypeId, size: u64
    ) -> Result<n::Memory, OutOfMemory> {
        let (storage, cache) = MemoryTypes::describe(memory_type.0);
        let device = self.shared.device.lock();
        debug!("allocate_memory type {:?} of size {}", memory_type, size);

        // Heaps cannot be used for CPU coherent resources
        //TEMP: MacOS supports Private only, iOS and tvOS can do private/shared
        let heap = if self.private_caps.resource_heaps && storage != MTLStorageMode::Shared && false {
            let descriptor = metal::HeapDescriptor::new();
            descriptor.set_storage_mode(storage);
            descriptor.set_cpu_cache_mode(cache);
            descriptor.set_size(size);
            let heap_raw = device.new_heap(&descriptor);
            n::MemoryHeap::Native(heap_raw)
        } else if storage == MTLStorageMode::Private {
            n::MemoryHeap::Private
        } else {
            let options = conv::resource_options_from_storage_and_cache(storage, cache);
            let cpu_buffer = device.new_buffer(size, options);
            debug!("\tbacked by cpu buffer {:?}", cpu_buffer.as_ptr());
            n::MemoryHeap::Public(memory_type, cpu_buffer)
        };

        Ok(n::Memory::new(heap, size))
    }

    fn bind_buffer_memory(
        &self, memory: &n::Memory, offset: u64, buffer: n::UnboundBuffer
    ) -> Result<n::Buffer, BindError> {
        debug!("bind_buffer_memory of size {} at offset {}", buffer.size, offset);
        let (raw, options, range) = match memory.heap {
            n::MemoryHeap::Native(ref heap) => {
                let resource_options = conv::resource_options_from_storage_and_cache(
                    heap.storage_mode(),
                    heap.cpu_cache_mode(),
                );
                let raw = heap.new_buffer(buffer.size, resource_options)
                    .unwrap_or_else(|| {
                        // TODO: disable hazard tracking?
                        self.shared.device
                            .lock()
                            .new_buffer(buffer.size, resource_options)
                    });
                (raw, resource_options, 0 .. buffer.size) //TODO?
            }
            n::MemoryHeap::Public(mt, ref cpu_buffer) => {
                debug!("\tmapped to public heap with address {:?}", cpu_buffer.as_ptr());
                let (storage, cache) = MemoryTypes::describe(mt.0);
                let options = conv::resource_options_from_storage_and_cache(storage, cache);
                (cpu_buffer.clone(), options, offset .. offset + buffer.size)
            }
            n::MemoryHeap::Private => {
                //TODO: check for aliasing
                let options = MTLResourceOptions::StorageModePrivate |
                    MTLResourceOptions::CPUCacheModeDefaultCache;
                let raw = self.shared.device
                    .lock()
                    .new_buffer(buffer.size, options);
                (raw, options, 0 .. buffer.size)
            }
        };

        Ok(n::Buffer {
            raw,
            range,
            options,
        })
    }

    // T stands for the element type of data_source.
    let size = data_source.len() as u64;
    let mut data_target = device
        .acquire_mapping_writer::<T>(&memory, 0..size)
        .unwrap();
    data_target[0..data_source.len()].copy_from_slice(data_source);
    let _ = device.release_mapping_writer(data_target);

    /// Acquire a mapping Writer.
    ///
    /// The accessible slice will correspond to the specified range (in bytes).
    fn acquire_mapping_writer<'a, T>(
        &self,
        memory: &'a B::Memory,
        range: Range<u64>,
    ) -> Result<mapping::Writer<'a, B, T>, mapping::Error>
    where
        T: Copy,
    {
        // The range is in bytes, so divide by the element size to get the slice length.
        let count = (range.end - range.start) as usize / mem::size_of::<T>();
        self.map_memory(memory, range.clone()).map(|ptr| unsafe {
            let start_ptr = ptr as *mut _;
            mapping::Writer {
                slice: slice::from_raw_parts_mut(start_ptr, count),
                memory,
                range,
                released: false,
            }
        })
    }
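
Because the mapping range is given in bytes while the resulting slice is indexed in elements, it is easy to map too little memory by passing an element count as the range end. The sketch below isolates the arithmetic used in acquire_mapping_writer and adds its inverse. Both helper functions are hypothetical and assume T is not a zero-sized type.

```rust
use std::mem;
use std::ops::Range;

/// Number of whole `T` values that fit in a byte range
/// (the same division used when building the mapped slice).
fn element_count<T>(range: &Range<u64>) -> usize {
    (range.end - range.start) as usize / mem::size_of::<T>()
}

/// Byte range needed to map `len` values of type `T` starting at byte `offset`.
fn byte_range_for<T>(offset: u64, len: usize) -> Range<u64> {
    offset..offset + (len * mem::size_of::<T>()) as u64
}
```

Under these assumptions, mapping four u32 values requires the byte range 0..16, while the range 0..4 would expose only a single element.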

    fn map_memory<R: RangeArg<u64>>(
        &self, memory: &n::Memory, generic_range: R
    ) -> Result<*mut u8, mapping::Error> {
        let range = memory.resolve(&generic_range);
        debug!("map_memory of size {} at {:?}", memory.size, range);

        // Only the CPU-visible (Public) heap can be mapped.
        let base_ptr = match memory.heap {
            n::MemoryHeap::Public(_, ref cpu_buffer) => cpu_buffer.contents() as *mut u8,
            n::MemoryHeap::Native(_) |
            n::MemoryHeap::Private => panic!("Unable to map memory!"),
        };
        Ok(unsafe { base_ptr.offset(range.start as _) })
    }

Reposted from: http://uknto.baihongyu.com/
