I'm writing an MQTT 5 library. To send a packet, I need to know the size of the payload before writing it. My solution for determining the size has the following constraints, ordered by importance:

  1. be easy to maintain
  2. should not create copies of the data
  3. should be fairly performant (avoid double calculations)

To determine the size, I can do any of the following:

  1. do the calculations by hand, which is fairly annoying
  2. hold a copy of the data to send in memory, which I want to avoid
  3. build a std::iter::ExactSizeIterator for the payload, itself composed of std::iter::Chains, which quickly leads to ugly types unless you create wrapper types

I decided to go with option 3.

The example below shows my attempt at writing an MQTT string iterator. An MQTT string consists of two bytes holding the length of the string (big-endian), followed by the data as UTF-8.

use std::iter::*;
use std::slice::Iter;

pub struct MQTTString<'a> {
    chain: Chain<Iter<'a, u8>, Iter<'a, u8>>,
}

impl<'a> MQTTString<'a> {
    pub fn new(s: &'a str) -> Self {
        let u16_len = s.len() as u16;
        let len_bytes = u16_len.to_be_bytes();
        let len_iter = len_bytes.iter(); // len_bytes is borrowed here

        let s_bytes = s.as_bytes();
        let s_iter = s_bytes.iter();

        let chain = len_iter.chain(s_iter);

        MQTTString { chain }
    }
}

impl<'a> Iterator for MQTTString<'a> {
    type Item = &'a u8;
    fn next(&mut self) -> Option<&'a u8> {
        self.chain.next()
    }
}

impl<'a> ExactSizeIterator for MQTTString<'a> {}

pub struct MQTTStringPait<'a> {
    chain: Chain<std::slice::Iter<'a, u8>, std::slice::Iter<'a, u8>>,
}

This implementation doesn't compile because I borrow len_bytes instead of moving it, so it would be dropped at the end of new while the returned Chain still references it:

error[E0515]: cannot return value referencing local variable `len_bytes`
  --> src/lib.rs:19:9
   |
12 |         let len_iter = len_bytes.iter(); // len_bytes is borrowed here
   |                        --------- `len_bytes` is borrowed here
...
19 |         MQTTString { chain }
   |         ^^^^^^^^^^^^^^^^^^^^ returns a value referencing data owned by the current function

Is there a nice way to do this? Adding len_bytes to the MQTTString struct doesn't help. Is there a better fourth option for solving the problem?

1 Answer

The root problem is that iter borrows the array. With array::IntoIter (nightly-only when this answer was written, behind the array_value_iter feature), you can iterate over the array by value, but it does require that you change your iterator to return u8 instead of &u8:

#![feature(array_value_iter)]

use std::array::IntoIter;
use std::iter::*;
use std::slice::Iter;

pub struct MQTTString<'a> {
    chain: Chain<IntoIter<u8, 2_usize>, Copied<Iter<'a, u8>>>,
}

impl<'a> MQTTString<'a> {
    pub fn new(s: &'a str) -> Self {
        let u16_len = s.len() as u16;
        let len_bytes = u16_len.to_be_bytes();
        let len_iter = std::array::IntoIter::new(len_bytes);

        let s_bytes = s.as_bytes();
        let s_iter = s_bytes.iter().copied();

        let chain = len_iter.chain(s_iter);

        MQTTString { chain }
    }
}

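impl<'a> Iterator for MQTTString<'a> {
    type Item = u8;
    fn next(&mut self) -> Option<u8> {
        self.chain.next()
    }
    // Forward the hint: the default (0, None) would make ExactSizeIterator::len panic.
    fn size_hint(&self) -> (usize, Option<usize>) {
        self.chain.size_hint()
    }
}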

impl<'a> ExactSizeIterator for MQTTString<'a> {}
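
On current stable toolchains the feature gate should no longer be needed, because arrays implement IntoIterator by value (Rust 1.53 and later). The following is a minimal sketch under that assumption, not a drop-in replacement: the name MqttStringBytes is hypothetical, and size_hint is forwarded so that the default ExactSizeIterator::len has an exact hint to work with.

```rust
use std::array;
use std::iter::{Chain, Copied};
use std::slice;

// Hypothetical name, chosen to avoid clashing with the MQTTString types above.
pub struct MqttStringBytes<'a> {
    chain: Chain<array::IntoIter<u8, 2>, Copied<slice::Iter<'a, u8>>>,
}

impl<'a> MqttStringBytes<'a> {
    pub fn new(s: &'a str) -> Self {
        // Like the code above, this truncates lengths larger than u16::MAX.
        let len_bytes = (s.len() as u16).to_be_bytes();
        // Fully qualified call so the by-value impl is selected in every edition.
        let len_iter = IntoIterator::into_iter(len_bytes);
        Self {
            chain: len_iter.chain(s.as_bytes().iter().copied()),
        }
    }
}

impl<'a> Iterator for MqttStringBytes<'a> {
    type Item = u8;

    fn next(&mut self) -> Option<u8> {
        self.chain.next()
    }

    // Chain reports an exact hint here; forwarding it keeps len() usable.
    fn size_hint(&self) -> (usize, Option<usize>) {
        self.chain.size_hint()
    }
}

impl<'a> ExactSizeIterator for MqttStringBytes<'a> {}
```

With this, MqttStringBytes::new("hi").len() reports the two length bytes plus the two data bytes before anything is consumed.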

You could do the same thing in stable Rust by using a Vec, but that'd be a bit of overkill. Instead, since you know the exact size of the array, you could destructure it into its two values and chain a few more iterators:

use std::iter::{self, *};
use std::slice;

pub struct MQTTString<'a> {
    chain: Chain<Chain<Once<u8>, Once<u8>>, Copied<slice::Iter<'a, u8>>>,
}

impl<'a> MQTTString<'a> {
    pub fn new(s: &'a str) -> Self {
        let u16_len = s.len() as u16;
        let [a, b] = u16_len.to_be_bytes();

        let s_bytes = s.as_bytes();
        let s_iter = s_bytes.iter().copied();

        let chain = iter::once(a).chain(iter::once(b)).chain(s_iter);

        MQTTString { chain }
    }
}

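impl<'a> Iterator for MQTTString<'a> {
    type Item = u8;
    fn next(&mut self) -> Option<u8> {
        self.chain.next()
    }
    // Forward the hint: the default (0, None) would make ExactSizeIterator::len panic.
    fn size_hint(&self) -> (usize, Option<usize>) {
        self.chain.size_hint()
    }
}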

impl<'a> ExactSizeIterator for MQTTString<'a> {}

An iterator of &u8 is not a good idea from the point of view of pure efficiency. On a 64-bit system, &u8 takes up 64 bits, as opposed to the 8 bits that the u8 itself would take. Additionally, dealing with this data on a byte-by-byte basis will likely impede common optimizations for bulk memory copies.

Instead, I'd recommend creating something that can write itself to something implementing Write. One possible implementation:

use std::{
    convert::TryFrom,
    io::{self, Write},
};

pub struct MQTTString<'a>(&'a str);

impl MQTTString<'_> {
    pub fn write_to(&self, mut w: impl Write) -> io::Result<()> {
        let len = u16::try_from(self.0.len()).expect("length exceeded 16-bit");
        let len = len.to_be_bytes();
        w.write_all(&len)?;
        w.write_all(self.0.as_bytes())?;
        Ok(())
    }
}
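
Since the question needs the size before anything is written, the Write-based approach can be paired with a cheap length calculation. This is only a sketch: SizedMqttString, its new constructor, and encoded_len are hypothetical names that are not part of the code above, and moving the length check into the constructor is one possible design among several.

```rust
use std::convert::TryFrom;
use std::io::{self, Write};

// Hypothetical wrapper; the length is validated once, at construction.
pub struct SizedMqttString<'a>(&'a str);

impl<'a> SizedMqttString<'a> {
    /// Returns None if the string does not fit a 16-bit length prefix.
    pub fn new(s: &'a str) -> Option<Self> {
        u16::try_from(s.len()).ok().map(|_| Self(s))
    }

    /// Number of bytes write_to will emit: 2 prefix bytes plus the data.
    pub fn encoded_len(&self) -> usize {
        2 + self.0.len()
    }

    pub fn write_to(&self, mut w: impl Write) -> io::Result<()> {
        // The cast cannot truncate because new() already checked the length.
        let len = (self.0.len() as u16).to_be_bytes();
        w.write_all(&len)?;
        w.write_all(self.0.as_bytes())
    }
}
```

A caller could sum encoded_len over all fields to build the packet header, then call write_to on each field, without copying the payload or walking it byte by byte.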

4 Comments

This is really nice and helped a lot, but using a copied iterator creates a copy of all data if I'm not mistaken. I know there's a problem with u8 and &u8 and they can't be mixed, but a solution only referencing the original data would be nice.
I took a look at your linked answer and I think I will create a struct which converts a string into an iterator like you demonstrated there for the pixel.
@Snapstromegon I plan on addressing this in the post after my lunch, but a &u8 is a really poor choice. It’s 64 bits wide and requires an indirect memory access, and then the consumer of the iterator still has to copy it to the network buffer anyway.
Your 64bit comment made me rethink. Storing the payload as chunks in a Vec<&[u8]> and using that as a stack would probably solve all issues...
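
The chunk idea from the last comment might look roughly like the sketch below. It is an assumption about what was meant, not code from the thread: MqttStringChunks, chunks, and encoded_len are hypothetical names. The struct owns the two length bytes and only hands out borrowed slices from a method taking &self, which sidesteps the borrow error from the question.

```rust
// Hypothetical type: owns the length prefix, borrows the string data.
pub struct MqttStringChunks<'a> {
    len_bytes: [u8; 2],
    data: &'a [u8],
}

impl<'a> MqttStringChunks<'a> {
    pub fn new(s: &'a str) -> Self {
        Self {
            // Like the question's code, this truncates lengths above u16::MAX.
            len_bytes: (s.len() as u16).to_be_bytes(),
            data: s.as_bytes(),
        }
    }

    /// The payload as borrowed chunks; nothing is copied here.
    pub fn chunks(&self) -> [&[u8]; 2] {
        [&self.len_bytes[..], self.data]
    }

    /// Total size, computed from the chunk lengths alone.
    pub fn encoded_len(&self) -> usize {
        self.chunks().iter().map(|c| c.len()).sum()
    }
}
```

Each chunk can then be passed to write_all as a whole slice, which keeps the bulk copy that the byte-by-byte iterator gives up.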
