Hey,
Here are my notes from the Lifetimes round table we had during the memory safety workshop. Feel free to add additional notes or corrections!
Ian McCormick: Rust lifetime invariants, easy to break assumptions in C/C++. Rust has a dynamic semantics that unsafe code or foreign code must also follow. Borrow sanitizer to dynamically check for violations. Including aliasing semantics. Dynamic analysis is faster to adopt than fully static annotations. Borrow sanitizer aims to be something that can be deployed. Currently it is mostly for unit tests and testing. Miri 3 orders of magnitude slower in Rust land. This work is using tree borrows model. Cheaper than Miri because checking pointers more selectively, only pointers crossing language boundaries. Borrows only used in Rust do not need to be checked. Pointers never used from Rust do not need to uphold Rust’s invariants, they are not checked.
ASAN might also be able to cut down on the runtime overhead by eliding some checks in some scenarios when a memory safe language is used.
John McCall: Swift lifetime model only enforced within Swift. Currently, the Swift model does not have the same expressiveness as Rust. C/C++ interop to bring in some enforcements based on annotations in C/C++ headers. Interpret Clang annotations in the Swift lifetime model. The C++ side could get more enforcement. Many C++ APIs already adhere to some lifetime model; it is just not explicitly described. C APIs often have more constrained interactions and are often easier to reason about.
Some heuristics on the C++ side work surprisingly well to infer lifetime annotations. Swift enforces exclusivity in its programming model. So calling C++ APIs from Swift is often safer as this invariant is enforced by the caller.
Not all C++ functions can be given safety properties based on these annotations.
Swift can help users adopt annotations based on some heuristics like data structures containing pointers might depend on some underlying storage. The Swift compiler can have a strict mode to make users annotate such types.
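To make the annotation discussion concrete, here is a minimal sketch of a lifetime annotation on a C++ API, using Clang's lifetimebound attribute. The macro name NOTES_LIFETIMEBOUND and the function longer are made up for illustration and were not shown at the round table; the macro only exists so the sketch also builds on compilers that do not know the attribute.

```cpp
#include <string>

// Hypothetical helper macro: expands to Clang's lifetimebound attribute
// where available and to nothing elsewhere.
#if defined(__clang__)
#define NOTES_LIFETIMEBOUND [[clang::lifetimebound]]
#else
#define NOTES_LIFETIMEBOUND
#endif

// The returned reference aliases one of the two arguments, so it must not
// outlive either of them. The annotation makes that dependency explicit
// instead of leaving it as an unwritten convention of the API.
const std::string& longer(const std::string& a NOTES_LIFETIMEBOUND,
                          const std::string& b NOTES_LIFETIMEBOUND) {
    return a.size() >= b.size() ? a : b;
}
```

With the annotation in place, a caller that binds the result to a reference while passing a temporary (for example longer(x, std::string("tmp"))) is the kind of code a compiler or an importing language could flag.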
Discussion: Who is responsible for writing annotations for an API? User, author, some tooling? Can we use some static analysis to help the adoption? Yes, the flow-sensitive lifetime analysis might be able to suggest annotations for some functions. Adopting these annotations can expose existing practices that cannot be expressed at the moment.
There are multiple languages with lifetime annotations. Rust and Swift are examples, and more might be coming. It would be unfortunate if C++ had many lifetime annotation languages to interop with all of them. The desired scenario is to have one annotation language that works both for C++ and for interoperating with other memory safe languages.
Utkarsh: Shift-left to find lifetime issues is important. The flow-sensitive lifetime analysis is bootstrapped from existing lifetime annotations. It is intra-procedural. It reasons about aliasing, invalidating operations, and uses. It is modelled after Rust’s Polonius but is modified to better fit C++. It is not a full borrow-checker, gives no guarantees, and does not attempt to find exclusivity violations. Currently, it can deal with destructors as invalidating operations. It cannot deal with other invalidating operations; we potentially need a new annotation language or effect system for that.
This analysis can help to infer lifetime annotations or to validate existing annotations in the future.
This analysis opportunistically finds use-after-free style issues.
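As an illustration of a destructor acting as an invalidating operation, here is a small sketch. The buggy shape is kept in a comment so that the code itself has no undefined behaviour; the function name size_via_borrow is invented for this example, and the sketch does not claim to show what the actual analysis reports.

```cpp
#include <cstddef>
#include <string>

// Buggy shape (comment only):
//
//   const std::string* borrowed = nullptr;
//   {
//       std::string local = "temporary";
//       borrowed = &local;      // `borrowed` now aliases `local`
//   }                           // destructor of `local` runs: invalidation
//   return borrowed->size();    // use after the invalidation
//
// Fixed shape: the owner outlives every use of the borrow.
std::size_t size_via_borrow() {
    std::string owner = "temporary";
    const std::string* borrowed = &owner;  // borrow starts
    return borrowed->size();               // last use, `owner` still alive
}
```

Everything needed to spot the buggy shape (the alias, the destructor call, the later use) is visible inside one function body, which is why an intra-procedural analysis has a chance of finding it.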
Discussion:
Can we get to a point where new code is guaranteed to be free of this class of errors? The new flow-sensitive analysis has a strict mode, but that one also does not give guarantees. Annotations alone cannot guarantee safety; sometimes it is part of the API design. We might need to enforce a policy like mutability xor sharing to give guarantees.
What properties are easy to verify by a tool and what programs are accepted by that model?
A property like mutability xor sharing is relatively easy to verify but will reject many programs. More permissive properties that accept more correct programs are harder and harder to verify at compile time. What is the continuum between mutability xor sharing and no rules at all?
- WebKit has a Safe C++ programming model that can prevent some classes of errors
- Dealing with invalidation instead of exclusivity is one potential way to loosen the rules
- Trying to permit some benign way of mutability and sharing is another way
- Validating the safety of arbitrary code might need really heavyweight tools like separation logic that cannot be fully automated. These need advanced solvers and humans to help write proofs.
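The trade-off above can be made concrete with two small functions that are correct C++ but would not fit a strict mutability xor sharing rule. This is only a sketch; the function names are invented for illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Two mutable aliases into the same vector are live at the same time.
// The elements are distinct, so the code is fine, but a strict
// exclusivity rule would reject it without extra help.
void swap_ends(std::vector<int>& v) {
    if (v.size() < 2) return;
    int& first = v.front();
    int& last = v.back();
    std::swap(first, last);
}

// A shared alias is live while the vector is mutated through `v`.
// The write does not reallocate, so `shared` is not invalidated. A model
// based on invalidation rather than exclusivity could accept this.
int bump_other_element(std::vector<int>& v) {
    assert(v.size() >= 2);
    const int& shared = v[0];
    v[1] += 1;  // mutation without invalidation
    return shared;
}
```

Replacing the write in bump_other_element with a push_back would make the alias dangle, which is the distinction an invalidation-based model has to track.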
If we had guaranteed safety we’d probably need something akin to unsafe blocks in Rust.
The lifetime model for C++ and the lifetime model for interop might have slightly different goals. The lifetime analysis for interop is more important at the language boundaries and is potentially not viral to make incremental adoption easier. Interop with Rust, Swift, or lifetimes for the sake of C++, how to make everyone happy? Carbon is experimenting with a more permissive safety model. They plan to relax some of the exclusivity rules and will use effects to drive the analysis.
Do we need a common lifetime analysis that is extensible in different ways so people can tune it to their needs? Possible to have one underlying model with different levels of enforcement? Could this common underlying model be a port of Polonius?
Also, since we have a clean slate, what can we improve on? Rust has some regrets, like no way to specify that an API only borrows a field of a struct.
Would be nice to have an effect system for functions to know about invalidations, aliasing, etc. at the call sites.
It is OK if some C++ APIs can only be used in unsafe code in memory safe languages. If a significant portion of these APIs can be used safely, that is already a huge improvement over the status quo. Some APIs are hard to model, like map’s remove operations, which invalidate iterators but return a new iterator that is actually valid. Also, with some APIs the end iterator might never get invalidated. Some models can represent this fact by explicitly stating that the end iterator does not have any aliases.
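For reference, the map pattern mentioned above looks like this with std::map::erase. It is a minimal sketch; the function name erase_even_keys is made up for the example.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Erases all entries with an even key and returns how many were removed.
std::size_t erase_even_keys(std::map<int, std::string>& m) {
    std::size_t erased = 0;
    for (auto it = m.begin(); it != m.end(); ) {
        if (it->first % 2 == 0) {
            // The old value of `it` is invalidated by erase, but the
            // iterator returned by erase is valid and points at the
            // next element (or is equal to m.end()).
            it = m.erase(it);
            ++erased;
        } else {
            ++it;
        }
    }
    return erased;
}
```

A lifetime model has to express both halves of this: the argument iterator dies, while the result and m.end() remain usable.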
One approach to deal with lifetimes is to use phantom/ghost fields. These are abstract representations for e.g., the elements of a vector. Invalidation and some other effects could be formulated in terms of these abstract fields as opposed to the implementation details.
Aliasing is hard in C++, we have iterator pairs all the time, we have many aliases.
What new C++ features could help adopting lifetimes?
- Could we change some defaults in the language? Required initialisation, reduce the need of some annotations by different defaults. We have some new defaults whenever we add annotations like bounds safety annotations assuming single pointers by default.
- Null attributes are another example where we could establish some language-wide defaults by convention.
- Could we use contracts somehow?
While verifying annotations we often do not want to warn about postcondition violations on paths where there was a precondition violation. Code often has defensive checks for out-of-contract callers.
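A small sketch of the defensive-check pattern, with the contract written as comments. The function name char_at_or_nul and its informal pre/postconditions are assumptions made up for this illustration and do not use any particular contracts syntax.

```cpp
#include <cstddef>
#include <string>

// Precondition (informal): index < text.size().
// Postcondition (informal): the result is a character of `text`.
char char_at_or_nul(const std::string& text, std::size_t index) {
    if (index >= text.size()) {
        // Defensive path for out-of-contract callers. The postcondition
        // does not hold here, but the precondition was already violated,
        // so a verifier should arguably not warn about this return.
        return '\0';
    }
    return text[index];
}
```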
CC-ing some people who were present just in case they have anything to add.