Symbolic/named address spaces

Hi all,

For targets that use more than one address space (GPUs and accelerators), would it make sense to support symbolic or named address spaces to improve IR readability? For the NVPTX target, for example, this means that ptr addrspace(1) %x could also be printed as ptr addrspace(global) %x. If we follow what we did for ImmArg pretty printing, another possibility is ptr addrspace(/* global */1) %x, so that the symbolic name is a pure comment.
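To make the two spellings concrete, here is a minimal standalone sketch of what a printer hook might emit. Everything in it is hypothetical: the ASPrintMode enum, lookupName, and printPtrType are invented for illustration and are not LLVM APIs, and the only mapping assumed is the 1 -> global example above.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical printing modes; none of these names exist in LLVM.
enum class ASPrintMode { Numeric, Symbolic, Comment };

// Illustrative lookup: only the "1 -> global" example from the post.
// An empty result means "no name known for this address space".
std::string_view lookupName(unsigned AS) {
  return AS == 1 ? std::string_view("global") : std::string_view();
}

// Render a pointer type in one of the spellings discussed above.
std::string printPtrType(unsigned AS, ASPrintMode Mode) {
  assert(AS < (1u << 24) && "address space must fit in 24 bits");
  if (AS == 0)
    return "ptr"; // The default address space is printed without addrspace().

  const std::string Num = std::to_string(AS);
  const std::string Name(lookupName(AS));

  // Fall back to the plain number when no name is known.
  if (Mode == ASPrintMode::Numeric || Name.empty())
    return "ptr addrspace(" + Num + ")";
  if (Mode == ASPrintMode::Symbolic)
    return "ptr addrspace(" + Name + ")";
  // Comment mode: the name is decoration only; the number stays authoritative.
  return "ptr addrspace(/* " + Name + " */" + Num + ")";
}
```

Only the Symbolic form would need parser support; the Comment form round-trips through the existing parser unchanged, since the number is still present.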

I have not looked into how this can be done, but to a first order, this will be purely a pretty-printing (and maybe parsing) feature, and LLVM IR will continue to use numbers. Additionally, this feature will be available only for modules that have a triple specified. Based on the triple, the implementation will query the number → name mappings for that triple and use them for printing and parsing. I am hoping that just having the triple will be sufficient to give meaningful names to address spaces, but folks can chime in if that’s not the case.

I wanted to check if this would be useful generally before digging into details.

Thanks
Rahul

I am very much +1 on this. It has always bugged me that we use an integer for something that should be a keyword: it looks like obfuscation…

I thought about it in the past and was considering maybe module metadata to encode the mapping of string->integer to handle the printing/parsing.
Triple may work, but wouldn’t this then require the target to be available in order to parse the IR?

Right, that’s the impl detail I don’t know (that is, whether the triple is sufficient and can be extended to vend out this information). With the metadata option, what you are suggesting is essentially encoding this as a first-class thing in the IR, right? As an example, similar to target triple and datalayout, we could have number->name mappings represented directly in the IR (as a new target address-space-names="0:-1:global-2:shared" entry at the top of the module). Here, 0 has no name, and 1 and 2 have names. Maybe it should be just "1:global-2:shared".
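For illustration only, a mapping string in that shape could be decoded roughly as follows. This is a sketch of the format floated above, not an existing LLVM facility: the function name parseAddressSpaceNames and the AddressSpaceNames alias are made up, and the grammar (entries separated by '-', number and name separated by ':', an empty name meaning "unnamed") is inferred from the one example string.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>

using AddressSpaceNames = std::map<unsigned, std::string>;

// Parses a hypothetical "N:name-N:name" spec. An entry with an empty name
// (e.g. "0:") leaves that address space unnamed. Returns std::nullopt on
// malformed input.
std::optional<AddressSpaceNames> parseAddressSpaceNames(std::string_view Spec) {
  AddressSpaceNames Names;
  while (!Spec.empty()) {
    const std::size_t Before = Spec.size();
    // Split off the next '-'-separated entry.
    const std::size_t Dash = Spec.find('-');
    const std::string_view Entry = Spec.substr(0, Dash);
    Spec = Dash == std::string_view::npos ? std::string_view()
                                          : Spec.substr(Dash + 1);
    assert(Spec.size() < Before && "each iteration must consume input");

    // Each entry must look like "<number>:<name>".
    const std::size_t Colon = Entry.find(':');
    if (Colon == std::string_view::npos)
      return std::nullopt;

    const std::string_view Num = Entry.substr(0, Colon);
    unsigned AS = 0;
    const char *End = Num.data() + Num.size();
    auto [Ptr, Ec] = std::from_chars(Num.data(), End, AS);
    if (Ec != std::errc() || Ptr != End || AS >= (1u << 24))
      return std::nullopt;

    const std::string_view Name = Entry.substr(Colon + 1);
    if (!Name.empty())
      Names[AS] = std::string(Name);
  }
  return Names;
}
```

Under this grammar, "0:-1:global-2:shared" and "1:global-2:shared" decode to the same two-entry map. One consequence of using '-' as the separator is that names themselves could not contain a dash.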

One concern is that, since this is not semantically load-bearing (yet), having a new first-class thing in the IR may be too heavyweight for what it’s trying to achieve.

Nothing new: I was thinking of module metadata

Metadata attached to a module using named metadata may not be dropped,

That might work, but it means targets that want to use this have to explicitly add this metadata to the IR during one of their initial passes. With the triple-based option, I was thinking that it would work on any existing IR as well, without needing to create a target. So we can add something like Triple::getAddressSpaceNames(), similar to Triple::computeDataLayout, and then use that.

This is not a dynamic program property and should not be encoded in the module. It will introduce new edge cases for the compiler to deal with.

No. The target would only be required if this information was put into CodeGen, where it shouldn’t be. This is more of an IR+ABI type of information which should not depend on the target, similar to the other information in TargetParser.

Thanks @arsenm. Does that mean adding a Triple::getAddressSpaceNames() that returns this mapping which is then used by the IR printer (and parser) is a feasible path forward?

Probably shouldn’t implement this as a function that returns a direct 1:1 mapping, as the address space is a 24-bit value. You could implement this in the triple as a function that gives a pretty name string for a given address space number.

Right, so something like StringRef Triple::getAddressSpaceName(unsigned AS). If we want parsing support, additionally std::optional<unsigned> Triple::getAddressSpaceNumber(StringRef Name);. The parser/printer can cache these to avoid repeated calls if necessary. The name is required to be a single token if we want parsing support; otherwise it can be a free-form string that is printed in a comment.
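A minimal standalone sketch of that pair of queries, written as free functions over a stand-in Arch enum rather than as members of llvm::Triple. All of it is an assumption for illustration: the Arch enum, the ASName struct, and the table contents (including the numbers) are placeholders, and std::string_view stands in for StringRef so the snippet compiles without LLVM headers.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical stand-in for a triple's architecture; not llvm::Triple.
enum class Arch { Unknown, NVPTX };

struct ASName {
  unsigned Number;
  std::string_view Name;
};

// Placeholder table for illustration only; the real numbering and names
// would be decided per target.
constexpr ASName NVPTXNames[] = {{1, "global"}, {3, "shared"}};

// Number -> name. An empty result means "no pretty name", so the printer
// would fall back to the plain number.
std::string_view getAddressSpaceName(Arch A, unsigned AS) {
  assert(AS < (1u << 24) && "address spaces are 24-bit values");
  if (A == Arch::NVPTX)
    for (const ASName &E : NVPTXNames)
      if (E.Number == AS)
        return E.Name;
  return {};
}

// Name -> number, needed only if the parser accepts symbolic names.
std::optional<unsigned> getAddressSpaceNumber(Arch A, std::string_view Name) {
  if (A == Arch::NVPTX)
    for (const ASName &E : NVPTXNames)
      if (E.Name == Name)
        return E.Number;
  return std::nullopt;
}
```

Querying by number, instead of returning a whole table, means a target only has to answer for the handful of address spaces it names, and every other value in the 24-bit range simply gets no name.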