By HanDong Zhang
Understanding by Module
In fact, the way the standard library groups these items into modules already gives you a rough idea of what they do.
- std::ops::Deref. As you can see, Deref is categorized under the ops module. If you look at the documentation, you will see that this module defines the traits for all the overloadable operators. For example, the Add trait corresponds to +, while the Deref trait corresponds to dereferencing a shared (immutable) borrow, such as *v. Correspondingly, there is also the DerefMut trait, which corresponds to dereferencing an exclusive (mutable) borrow. Since ownership semantics run through the whole Rust language, the semantics of owner / immutable borrowing (&T) / mutable borrowing (&mut T) all appear together.
- std::convert::AsRef. As you can see, AsRef is grouped under the convert module. If you look at the documentation, you will see that this module defines the traits related to type conversions. For example, the familiar From/Into, TryFrom/TryInto and AsRef/AsMut also appear in pairs here, indicating that the trait is related to type conversions. Based on the naming rules in the Rust API Guidelines, we can infer that methods starting with as_ represent conversions from borrow -> borrow, i.e., reference -> reference, and are overhead-free. Such conversions also cannot fail.
- std::borrow::Borrow. As you can see, Borrow is categorized under the borrow module. The documentation for this module is minimal, with a single sentence saying that it is for working with borrowed data. So the trait is more or less related to expressing borrow semantics. Three traits are provided: Borrow / BorrowMut / ToOwned, which correspond exactly to the ownership semantics.
- std::borrow::Cow. As you can see, Cow is also placed in the borrow module. According to the description, Cow is a clone-on-write smart pointer. The main reason for putting it in the borrow module is that it exists to borrow as much as possible and avoid copying, as an optimization.
std::ops::Deref
First, let's look at the definition of the trait.
pub trait Deref {
    type Target: ?Sized;

    #[must_use]
    fn deref(&self) -> &Self::Target;
}
The definition is not complicated: Deref contains only a deref method signature. The beauty of this trait is that it is called "implicitly" by the compiler, which is officially called "deref coercion".
Here is an example from the standard library.
use std::ops::Deref;

struct DerefExample<T> {
    value: T,
}

impl<T> Deref for DerefExample<T> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        &self.value
    }
}

fn main() {
    let x = DerefExample { value: 'a' };
    // *x goes through Deref::deref and yields the wrapped char
    assert_eq!('a', *x);
}
In the code, the DerefExample structure implements the Deref trait, so it can be dereferenced with the * operator. In the example, a reference to the value field is returned directly.
As you can see, DerefExample has pointer-like behavior: because it implements Deref, it can be dereferenced. DerefExample thereby becomes a kind of smart pointer. Checking whether a type implements Deref is one way to identify whether it is a smart pointer. But not all smart pointers implement Deref; some implement Drop, or both.
Now let's summarize Deref.
If T implements Deref<Target=U>, and x is an instance of type T, then:
- In an immutable context, the operation *x (when T is neither a reference nor a raw pointer) is equivalent to *Deref::deref(&x).
- A value of type &T is coerced to a value of type &U (deref coercion).
- T implicitly gains all the (immutable) methods of U.
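The three rules can be checked with a small sketch. The Wrapper type and the byte_len function are hypothetical names introduced only for this illustration; they do not come from the standard library.

```rust
use std::ops::Deref;

// Hypothetical wrapper type, used only to illustrate the three rules.
struct Wrapper {
    inner: String,
}

impl Deref for Wrapper {
    type Target = String;

    fn deref(&self) -> &Self::Target {
        &self.inner
    }
}

// Takes &String on purpose, so that passing &Wrapper needs deref coercion.
fn byte_len(s: &String) -> usize {
    s.len()
}

fn main() {
    let w = Wrapper { inner: String::from("hello") };

    // Rule 1: *w is equivalent to *Deref::deref(&w).
    assert_eq!(*w, *Deref::deref(&w));

    // Rule 2: &Wrapper is coerced to &String at the call site.
    assert_eq!(byte_len(&w), 5);

    // Rule 3: methods that take &self on the target can be called on Wrapper.
    assert!(w.starts_with("he"));
}
```

Note that starts_with is a method of str, so the last call dereferences twice: Wrapper to String, then String to str.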
The beauty of Deref is that it enhances the Rust development experience. A typical example from the standard library is that Vec<T> shares all the methods of slice by implementing Deref.
// Excerpt from the standard library source
impl<T, A: Allocator> ops::Deref for Vec<T, A> {
    type Target = [T];

    fn deref(&self) -> &[T] {
        unsafe { slice::from_raw_parts(self.as_ptr(), self.len) }
    }
}
For example, the simplest method, len(), is actually defined in the slice module. In Rust, when a method is called with . or a value is passed in a function argument position, the compiler automatically performs deref coercion. So it is as if Vec<T> had the slice methods as well.
fn main() {
    let a = vec![1, 2, 3];
    assert_eq!(a.len(), 3); // when a calls len(), deref coercion takes place
}
Implicit behavior is not common in Rust, but Deref is one such case, and its implicit coercion makes smart pointers easy to use.
fn main() {
    let h = Box::new("hello");
    assert_eq!(h.to_uppercase(), "HELLO");
}
For example, when we work with Box<T>, we do not need to manually dereference it to reach the T inside. It is as if the outer Box<T> layer were transparent, and we can operate on T directly.
Another example.
fn uppercase(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let s = String::from("hello");
    assert_eq!(uppercase(&s), "HELLO");
}
The argument type of the uppercase function above is clearly &str, but the type actually passed in the main function is &String, so why does it compile successfully? It is because String implements Deref.
// Excerpt from the standard library source
impl ops::Deref for String {
    type Target = str;

    #[inline]
    fn deref(&self) -> &str {
        unsafe { str::from_utf8_unchecked(&self.vec) }
    }
}
That's the beauty of Deref. This behavior looks a bit like inheritance, and some people mistake it for that, which is a big mistake. Please don't use Deref just to simulate inheritance.
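One way to see why Deref is not inheritance is that a wrapper does not pick up the trait implementations of its target. The sketch below uses hypothetical names (Speak, Dog, Puppy, speak_twice) invented for this illustration.

```rust
use std::ops::Deref;

// Hypothetical trait and types, used only for illustration.
trait Speak {
    fn speak(&self) -> String;
}

struct Dog;

impl Speak for Dog {
    fn speak(&self) -> String {
        String::from("woof")
    }
}

// Puppy wraps a Dog and derefs to it, but does not implement Speak itself.
struct Puppy {
    dog: Dog,
}

impl Deref for Puppy {
    type Target = Dog;

    fn deref(&self) -> &Dog {
        &self.dog
    }
}

// Generic function bounded on the trait rather than on a concrete type.
fn speak_twice<T: Speak>(t: &T) -> String {
    format!("{} {}", t.speak(), t.speak())
}

fn main() {
    let puppy = Puppy { dog: Dog };

    // Method-call syntax auto-dereferences, so this compiles.
    assert_eq!(puppy.speak(), "woof");

    // This would not compile: T is inferred as Puppy, and Puppy does not
    // implement Speak. No deref coercion happens for a generic parameter.
    // speak_twice(&puppy);

    // An explicit deref to the inner Dog is needed instead.
    assert_eq!(speak_twice(&*puppy), "woof woof");
}
```

In an inheritance-based language a subclass could be passed wherever the parent interface is expected; here the trait bound is checked against Puppy itself, so the "is-a" relationship does not hold.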
std::convert::AsRef
Let's look at the definition of AsRef.
pub trait AsRef<T: ?Sized> {
    fn as_ref(&self) -> &T;
}
We already know that AsRef can be used for conversions. Compared to Deref, which behaves implicitly, AsRef is an explicit conversion.
fn is_hello<T: AsRef<str>>(s: T) {
    assert_eq!("hello", s.as_ref());
}

fn main() {
    let s = "hello";
    is_hello(s);

    let s = "hello".to_string();
    is_hello(s);
}
In the above example, is_hello is a generic function. The conversion is achieved by the bound T: AsRef<str> and by an explicit call to s.as_ref() inside the function. Both String and str implement AsRef<str>.
So now the question is, when do you use AsRef? Why not just use &T?
Consider an example like this.
pub struct Thing {
    name: String,
}

impl Thing {
    // WhatTypeHere and some_conversion are placeholders for the options discussed below
    pub fn new(name: WhatTypeHere) -> Self {
        Thing { name: name.some_conversion() }
    }
}
In the above example, the name parameter of the new function has the following options for its type.
1. &str. In this case, the caller needs to pass in a reference. But in order to convert it to a String, the callee has to allocate memory itself and make a copy.
2. String. In this case, a caller that already has a String can simply pass it, but a caller that only has a reference must allocate and copy first, similar to case 1.
3. T: Into<String>. In this case, the caller can pass either &str or String, but a &str is still allocated and copied during the type conversion.
4. T: AsRef<str>. Similar to case 3: the caller can pass either type, and the callee allocates and copies to build the String.
5. T: Into<Cow<'a, str>>, where some allocations can be avoided. Cow will be described later.
There is no one-size-fits-all answer to the question of when to use which type. Some people just like &str and will use it no matter what. There are trade-offs here.
- When allocation and copying are not a concern, there is no need to make type signatures too complicated; just use &str.
- In some cases you need to look at the method definition and whether it needs to consume ownership, or return ownership or a borrow.
- In some cases you need to minimize allocation and copying, so a more complex type signature is necessary, as in case 5.
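To make the comparison concrete, here is a minimal sketch of how the &str, Into<String>, AsRef<str> and Into<Cow<'a, str>> cases for Thing::new might look. The constructor names (from_str_slice, from_into, from_as_ref), the name accessor and the CowThing type are hypothetical and exist only so the variants can sit side by side.

```rust
use std::borrow::Cow;

pub struct Thing {
    name: String,
}

impl Thing {
    // &str case: borrow from the caller and always allocate a new String.
    pub fn from_str_slice(name: &str) -> Self {
        Thing { name: name.to_owned() }
    }

    // Into<String> case: a String is moved in as-is, a &str is allocated by into().
    pub fn from_into(name: impl Into<String>) -> Self {
        Thing { name: name.into() }
    }

    // AsRef<str> case: the callee only sees a &str, so it always allocates.
    pub fn from_as_ref(name: impl AsRef<str>) -> Self {
        Thing { name: name.as_ref().to_owned() }
    }

    pub fn name(&self) -> &str {
        &self.name
    }
}

// Into<Cow<'a, str>> case: store a Cow so borrowed input is not copied at all.
pub struct CowThing<'a> {
    name: Cow<'a, str>,
}

impl<'a> CowThing<'a> {
    pub fn new(name: impl Into<Cow<'a, str>>) -> Self {
        CowThing { name: name.into() }
    }

    pub fn is_borrowed(&self) -> bool {
        matches!(self.name, Cow::Borrowed(_))
    }
}

fn main() {
    let owned = String::from("owned");

    assert_eq!(Thing::from_str_slice("a").name(), "a");
    assert_eq!(Thing::from_into(owned.clone()).name(), "owned");
    assert_eq!(Thing::from_as_ref(&owned).name(), "owned");

    // A string literal stays borrowed; a String is stored as owned.
    assert!(CowThing::new("static").is_borrowed());
    assert!(!CowThing::new(owned).is_borrowed());
}
```

The Cow variant changes the struct itself (it gains a lifetime parameter), which is part of the extra signature complexity the trade-off above refers to.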
Application of Deref and AsRef in API design
The wasm-bindgen library contains a component called web-sys.
This component is the binding of Rust to the browser Web API. As such, web-sys makes it possible to manipulate the browser DOM with Rust code, fetch server data, draw graphics, handle audio and video, handle client-side storage, and more.
However, binding Web APIs with Rust is not that simple. For example, manipulating the DOM relies on JavaScript class inheritance, so web-sys must provide access to this inheritance hierarchy. In web-sys, access to this inheritance structure is provided using Deref and AsRef.
Using Deref
let element: &Element = ...;
element.append_child(..); // call a method on `Node`
method_expecting_a_node(&element); // coerce to `&Node` implicitly
let node: &Node = &element; // explicitly coerce to `&Node`
If you have a web_sys::Element, then you can get a web_sys::Node implicitly by using deref.
The use of deref is mainly for API ergonomics: it lets developers use the . operator to call parent-class methods transparently.
Using AsRef
A large number of AsRef conversions are also implemented in web-sys for various types.
impl AsRef<HtmlElement> for HtmlAnchorElement
impl AsRef<Element> for HtmlAnchorElement
impl AsRef<Node> for HtmlAnchorElement
impl AsRef<EventTarget> for HtmlAnchorElement
impl AsRef<Object> for HtmlAnchorElement
impl AsRef<JsValue> for HtmlAnchorElement
A reference to a parent structure can be obtained by explicitly calling .as_ref().
Deref focuses on implicitly and transparently using the parent structure, while AsRef focuses on explicitly obtaining a reference to the parent structure. This is a trade-off with a specific API design, rather than a mindless simulation of OOP inheritance.
Another example of using AsRef is the http-types library, which uses AsRef and AsMut to convert between various types.
For example, Request is a combination of Stream / Headers / URL, so it implements AsRef<Url>, AsRef<Headers>, and AsyncRead. Similarly, Response is a combination of Stream / Headers / Status Code, so it implements AsRef<StatusCode>, AsRef<Headers>, and AsyncRead.
fn forwarded_for(headers: impl AsRef<http_types::Headers>) {
    // get the X-forwarded-for header
}

// So forwarded_for can conveniently handle Request / Response / Trailers
let fwd1 = forwarded_for(&req);
let fwd2 = forwarded_for(&res);
let fwd3 = forwarded_for(&trailers);
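The implementing side of this pattern can be sketched without the real crate. The Headers, Request and Response types below are simplified, hypothetical stand-ins and are not the actual http-types definitions; only the shape of the AsRef implementations is the point.

```rust
// Hypothetical stand-ins for the http-types structures, for illustration only.
#[derive(Default)]
struct Headers {
    entries: Vec<(String, String)>,
}

impl Headers {
    // Case-insensitive header lookup over the stored pairs.
    fn get(&self, name: &str) -> Option<&str> {
        self.entries
            .iter()
            .find(|(k, _)| k.eq_ignore_ascii_case(name))
            .map(|(_, v)| v.as_str())
    }
}

struct Request {
    headers: Headers,
}

struct Response {
    headers: Headers,
}

// Each composite type exposes its Headers part through AsRef.
impl AsRef<Headers> for Request {
    fn as_ref(&self) -> &Headers {
        &self.headers
    }
}

impl AsRef<Headers> for Response {
    fn as_ref(&self) -> &Headers {
        &self.headers
    }
}

// One function serves every type that can hand out a &Headers.
fn forwarded_for(source: &impl AsRef<Headers>) -> Option<&str> {
    source.as_ref().get("X-Forwarded-For")
}

fn main() {
    let headers = Headers {
        entries: vec![("X-Forwarded-For".to_string(), "203.0.113.7".to_string())],
    };
    let req = Request { headers };
    let res = Response { headers: Headers::default() };

    assert_eq!(forwarded_for(&req), Some("203.0.113.7"));
    assert_eq!(forwarded_for(&res), None);
}
```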
std::borrow::Borrow
Take a look at the definition of Borrow.
pub trait Borrow<Borrowed: ?Sized> {
    fn borrow(&self) -> &Borrowed;
}
Contrast with AsRef:
pub trait AsRef<T: ?Sized> {
    fn as_ref(&self) -> &T;
}
Isn't this very similar? So, some people suggest that one of these two traits could be removed altogether. But in fact, there is a difference between Borrow and AsRef, and they both have their own uses.
The Borrow trait is used to represent borrowed data, while the AsRef trait is used for type conversion. In Rust, it is common to provide different representations of a type for different use cases and different semantics.
By implementing Borrow<T>, a type provides a reference/borrow to T in the borrow() method, expressing the semantics that it can be borrowed as T, rather than converted to some type T. A type can be freely borrowed as several different types, and it can also be borrowed mutably.
So how do you choose between Borrow and AsRef?
- Choose Borrow when you want to abstract over different borrow types in a uniform way, or when you want to create a data structure that handles owned values and borrowed values in the same way.
- Choose AsRef when you simply want to convert a type directly to a reference and you are writing generic code. This is the simpler case.
In fact, the HashMap example given in the standard library documentation explains this very well. Let me translate it for you.
HashMap<K, V> stores key-value pairs, and its API should be able to retrieve the corresponding value in the HashMap properly using either the key's own value or its reference. Since the HashMap has to hash and compare keys, it must require that both the key's own value and the reference behave the same when hashed and compared.
use std::borrow::Borrow;
use std::hash::Hash;

pub struct HashMap<K, V> {
    // fields omitted
}

impl<K, V> HashMap<K, V> {
    // The insert method takes the key by value and takes ownership of it.
    pub fn insert(&self, key: K, value: V) -> Option<V>
    where K: Hash + Eq
    {
        // ...
    }

    // The get method can look up a value through a reference to the key,
    // written as &Q here, where Q must satisfy `Q: Hash + Eq + ?Sized`.
    // `K: Borrow<Q>` states that K can be borrowed as Q.
    // Q is therefore required to hash and compare exactly like K.
    pub fn get<Q>(&self, k: &Q) -> Option<&V>
    where
        K: Borrow<Q>,
        Q: Hash + Eq + ?Sized
    {
        // ...
    }
}
Borrow is a bound on borrowed data and is used together with additional traits, such as Hash and Eq in the example.
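The effect of this bound can be seen with the real std::collections::HashMap. In the sketch below, get_or_default is a hypothetical helper written for this illustration; it repeats the same K: Borrow<Q> bound so that generic code can accept any borrowed form of the key.

```rust
use std::borrow::Borrow;
use std::collections::HashMap;
use std::hash::Hash;

// Hypothetical helper: look up a key through any borrowed form Q of K.
fn get_or_default<K, Q, V>(map: &HashMap<K, V>, key: &Q, default: V) -> V
where
    K: Borrow<Q> + Hash + Eq,
    Q: Hash + Eq + ?Sized,
    V: Clone,
{
    map.get(key).cloned().unwrap_or(default)
}

fn main() {
    let mut ages: HashMap<String, u32> = HashMap::new();
    ages.insert(String::from("alice"), 30);

    // String: Borrow<str>, so a &str works as the lookup key without
    // allocating a temporary String.
    assert_eq!(ages.get("alice"), Some(&30));

    // The same bound lets generic code accept &str (Q = str) ...
    assert_eq!(get_or_default(&ages, "bob", 0), 0);

    // ... or &String (Q = String, through the blanket impl Borrow<T> for T).
    assert_eq!(get_or_default(&ages, &String::from("alice"), 0), 30);
}
```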
See another example.
use std::hash::{Hash, Hasher};

// Can this structure be used as the key of a HashMap?
pub struct CaseInsensitiveString(String);

// It implements PartialEq without problems
impl PartialEq for CaseInsensitiveString {
    fn eq(&self, other: &Self) -> bool {
        // Note that the comparison here is required to ignore ASCII case
        self.0.eq_ignore_ascii_case(&other.0)
    }
}

impl Eq for CaseInsensitiveString { }

// Implementing Hash is no problem
// But since PartialEq ignores case, the hash calculation must also ignore case
impl Hash for CaseInsensitiveString {
    fn hash<H: Hasher>(&self, state: &mut H) {
        for c in self.0.as_bytes() {
            c.to_ascii_lowercase().hash(state)
        }
    }
}
Can CaseInsensitiveString implement Borrow<str>?
Obviously, CaseInsensitiveString and str have different implementations of Hash: str does not ignore case. Therefore, Borrow<str> must not be implemented for CaseInsensitiveString, which means a HashMap keyed by CaseInsensitiveString cannot be queried with a plain &str. What happens if we implement Borrow<str> anyway? Lookups will fail whenever the case differs, because the key and the query hash differently.
But CaseInsensitiveString can safely implement AsRef<str>.
This is the difference between Borrow and AsRef. Borrow is stricter and represents completely different semantics from AsRef.
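As a sketch of what remains possible: assuming the CaseInsensitiveString definitions shown earlier (repeated here so the block is self-contained), the type works as a HashMap key when the lookup goes through its own Hash and Eq, and AsRef<str> still exposes the underlying text. What is ruled out is only the Borrow<str> shortcut for lookups by &str.

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

pub struct CaseInsensitiveString(String);

impl PartialEq for CaseInsensitiveString {
    fn eq(&self, other: &Self) -> bool {
        self.0.eq_ignore_ascii_case(&other.0)
    }
}

impl Eq for CaseInsensitiveString {}

impl Hash for CaseInsensitiveString {
    fn hash<H: Hasher>(&self, state: &mut H) {
        for c in self.0.as_bytes() {
            c.to_ascii_lowercase().hash(state)
        }
    }
}

// AsRef<str> carries no Hash/Eq contract, so it is safe to implement.
impl AsRef<str> for CaseInsensitiveString {
    fn as_ref(&self) -> &str {
        &self.0
    }
}

fn main() {
    let mut map = HashMap::new();
    map.insert(CaseInsensitiveString("Hello".to_string()), 1);

    // The lookup uses the type's own Hash and Eq, so case is ignored.
    let key = CaseInsensitiveString("HELLO".to_string());
    assert_eq!(map.get(&key), Some(&1));

    // The underlying text is still reachable through an explicit conversion.
    let text: &str = key.as_ref();
    assert_eq!(text, "HELLO");
}
```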
std::borrow::Cow
Look at the definition of Cow.
pub enum Cow<'a, B>
where
    B: 'a + ToOwned + ?Sized,
{
    Borrowed(&'a B),
    Owned(<B as ToOwned>::Owned),
}
As you can see, Cow is an enumeration. It is somewhat similar to Option in that it represents one of two cases. For Cow the two cases are borrowed and owned, and only one of them can hold at a time.
The main functions of Cow are:
- It acts as a smart pointer, providing transparent immutable access to instances of the wrapped type (e.g. the original immutable methods of the type can be called directly, because Cow implements Deref, but not DerefMut).
- If there is a need to modify an instance of the type, or to gain ownership of it, Cow provides methods to do the cloning and to avoid repeated cloning.
Cow is designed to improve performance (by reducing copying) while increasing flexibility, because in most business scenarios reads are far more frequent than writes. With Cow, this can be achieved in a uniform, canonical form, where the object is copied only once, when a write is needed. This may reduce the number of copies significantly.
It has the following key points to master.
- Cow<T> can directly call the immutable methods of T, since Cow, an enumeration, implements Deref.
- The .to_mut() method can be used to obtain a mutable borrow of an owned value when T needs to be modified.
  - Note that a call to .to_mut() does not necessarily result in a Clone.
  - Calling .to_mut() when the value is already owned is valid, but does not produce a new Clone.
  - Multiple calls to .to_mut() will produce only one Clone.
- .into_owned() can be used to create a new owned object when T needs to be modified, a process that often implies a memory copy and the creation of a new object.
  - Calling this operation will perform a Clone if the value in the Cow was previously in the borrowed state.
  - This method takes self as its argument, so it "consumes" the original Cow instance. After the call, the lifetime of the original instance ends, so it cannot be called more than once on the same Cow.
Cow is often used in API design.
use std::borrow::Cow;

// Use Cow for the return value to avoid multiple copies
fn remove_spaces<'a>(input: &'a str) -> Cow<'a, str> {
    if input.contains(' ') {
        let mut buf = String::with_capacity(input.len());
        for c in input.chars() {
            if c != ' ' {
                buf.push(c);
            }
        }
        return Cow::Owned(buf);
    }
    return Cow::Borrowed(input);
}
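The behavior of .to_mut() and .into_owned() described in the key points can be observed in a short sketch. The ensure_trailing_newline function is a hypothetical helper written for this illustration.

```rust
use std::borrow::Cow;

// Hypothetical helper: appends a newline only when one is missing,
// so input that is already fine is returned without any copy.
fn ensure_trailing_newline(input: &str) -> Cow<'_, str> {
    let mut text = Cow::Borrowed(input);
    if !text.ends_with('\n') {
        // to_mut() clones the borrowed data once and switches to Owned.
        text.to_mut().push('\n');
    }
    text
}

fn main() {
    // No write was needed, so the Cow is still borrowed.
    let unchanged = ensure_trailing_newline("done\n");
    assert!(matches!(unchanged, Cow::Borrowed(_)));

    // A write was needed, so the data was cloned into an owned String.
    let mut changed = ensure_trailing_newline("todo");
    assert!(matches!(changed, Cow::Owned(_)));

    // Already owned: another to_mut() mutates in place without cloning again.
    changed.to_mut().push_str("more\n");
    assert_eq!(changed, "todo\nmore\n");

    // into_owned() consumes the Cow; a borrowed value is cloned at this point.
    let s: String = unchanged.into_owned();
    assert_eq!(s, "done\n");
}
```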
Of course, when to use Cow comes back to the "when to use AsRef" discussion earlier in this article: there are trade-offs, and there is no one-size-fits-all answer.
Summary
To understand the various types and traits in Rust, you need to keep the ownership semantics in mind and study the documentation and examples carefully; with that, they should be easy to understand. I hope this article has cleared up some of your doubts. Feel free to share your feedback.