Deserialize a Vec of values into a struct by implementing serde `Deserializer` - serialization

I have a data format using custom enum of values. From the database I receive a Vec<MyVal>. I want to convert this to a struct (and fail if it doesn't work). I want to use serde because after processing I want to return the API response as a json, and serde makes this super easy.
Playground link for the example
enum MyVal {
Bool(bool),
Text(String)
}
#[derive(Serialize, Deserialize)]
struct User {
name: String,
registered: bool
}
The challenge is with converting the data format into the serde data model. For this I can implement a Deserializer and implement the visit_seq method i.e. visit the Vec<MyVal> as if it were a sequence and return the values one by one. The visitor for User can consume the visited values to build the struct User.
However I'm not able to figure out how to convert the Vec into something visitor_seq can consume. Here's some sample code.
struct MyWrapper(Vec<MyVal>);
impl<'de> Deserializer<'de> for MyWrapper {
type Error = de::value::Error;
// skip unncessary
fn deserialize_seq<V>(self, visitor: V) -> Result<V::Value, Self::Error>
where
V: serde::de::Visitor<'de>,
{
let convert_myval = move |myval: &MyVal| match myval {
MyVal::Bool(b) => visitor.visit_bool(*b),
MyVal::Text(s) => visitor.visit_string(s.clone())
};
// now I have a vec of serde values
let rec_vec: Vec<Result<V::Value, Self::Error>> =
self.0.iter().map(|myval| convert_myval(myval)).collect();
// giving error!!
visitor.visit_seq(rec_vec.into_iter())
}
}
The error is
92 | visitor.visit_seq(rec_vec.into_iter())
| --------- ^^^^^^^^^^^^^^^^^^^ the trait `SeqAccess<'de>` is not implemented for `std::vec::IntoIter<Result<<V as Visitor<'de>>::Value, _::_serde::de::value::Error>>`
| |
| required by a bound introduced by this call
So I looked into SeqAccess and it has an implementor that requires that whatever is passed to it implement the Iterator trait. But I thought I had that covered, because vec.into_iter returns an IntoIter, a consuming iterator which does implement the Iterator trait.
So I'm completely lost as to what's going wrong here. Surely there must be a way to visit a Vec<Result<Value, Error>> as a seq?

Preface: The question wants to treat a Rust data structure Vec<MyData> like a serialized piece of data (e.g.: like a JSON string) and allow deserializing that into any other Rust data structure that implements Deserialize. This is a quite unusual, but not without precedent. And since the MyVals are actually pieces of data with various types which get returned from a database access crate, this approach does make sense.
The main problem with the code in the question is that it tries to deserialize two different data structures (MyWrapper<Vec<MyVal>> and MyVal) with a single Deserializer. The obvious way out is to define a second struct MyValWrapper(MyVal) and implement Deserializer for it:
struct MyValWrapper(MyVal);
impl<'de> Deserializer<'de> for MyValWrapper {
type Error = de::value::Error;
fn deserialize_any<V>(self, visitor: V) -> Result<V::Value, Self::Error>
where
V: de::Visitor<'de>,
{
match self.0 {
MyVal::Bool(b) => visitor.visit_bool(b),
MyVal::Text(s) => visitor.visit_string(s.clone()),
}
}
// Maybe you should call deserialize_any from all other deserialize_* functions to get a passable error message, e.g.:
forward_to_deserialize_any! {
bool i8 i16 i32 i64 i128 u8 u16 u32 u64 u128 f32 f64 char str string
bytes byte_buf option unit unit_struct newtype_struct seq tuple
tuple_struct map struct enum identifier ignored_any
}
// Though maybe you want to handle deserialize_enum differently...
}
MyValWrapper now can deserialize a MyVal. To use MyValWrapper to deserialize a Vec of MyVals, the serde::value::SeqDeserializer adapter is convenient, it can turn an iterator of (Into)Deserializer into a sequence Deserializer:
let data: Vec<MyData> = …;
// Vec deserialization snippet
let mut sd = SeqDeserializer::new(data.into_iter().map(|myval| MyValWrapper(myval)));
let res = visitor.visit_seq(&mut sd)?;
sd.end()?;
Ok(res)
For some reason, SeqDeserializer requires the iterator items to be IntoDeserializer, but no impl<T: IntoDeserializer> Deserializer for T exists, so we need to make sure that MyValWrapper is not only a Deserializer but trivially also an IntoDeserializer:
impl<'de> IntoDeserializer<'de> for MyValWrapper {
type Deserializer = MyValWrapper;
fn into_deserializer(self) -> Self::Deserializer {
self
}
}
Finally, you need to impl Deserializer for MyWrapper (you can use the "Vec deserialization snippet" for that) — if you do actually need Deserializer to be implemented, which I suspect you don't: the SeqDeserializer already implements Deserializer, and it is a wrapper struct (just as MyWrapper is a wrapper struct). Especially, if your final goal is having a function like
fn turn_bad_myvals_into_good_T<T: DeserializeOwned>(v: Vec<MyVal>) -> T {
T::deserialize(SeqDeserializer::new(
v.into_iter().map(|myval| MyValWrapper(myval)),
))
}
then you entirely don't need MyWrapper.

Related

Rust define trait interface for any type?

I am building a rendering engine, one thing I want is to handle mangagement of arbitrary mesh data, regardless of representation.
My idea was, define a trait the enforces a function signature and then serialization can be handled by the user while I handle all the gpu stuff. This was the trait I made:
pub enum GpuAttributeData
{
OwnedData(Vec<Vec<i8>>, Vec<u32>),
}
pub trait GpuSerializable
{
fn serialize(&self) -> GpuAttributeData;
}
So very simple, give me a couple of arrays.
When i tested things inside the crate it worked, but I moved my example outside the crate, i.e. I have this snippet in an example:
impl <const N : usize> GpuSerializable for [Vertex; N]
{
fn serialize(&self) -> GpuAttributeData
{
let size = size_of::<Vertex>() * self.len();
let data = unsafe {
let mut data = Vec::<i8>::with_capacity(size);
copy_nonoverlapping(
self.as_ptr() as *const i8, data.as_ptr() as *mut i8, size);
data.set_len(size);
data
};
// let indices : Vec<usize> = (0..self.len()).into_iter().collect();
let indices = vec![0, 1, 2];
let mut buffers :Vec<Vec<i8>> = Vec::new();
buffers.push(data);
return GpuAttributeData::OwnedData(buffers, indices);
}
}
Which gives me this error:
error[E0117]: only traits defined in the current crate can be implemented for arbitrary types
--> examples/01_spinning_triangle/main.rs:41:1
|
41 | impl <const N : usize> GpuSerializable for [Vertex; N]
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^-----------
| | |
| | this is not defined in the current crate because arrays are always foreign
| impl doesn't use only types from inside the current crate
|
= note: define and implement a trait or new type instead
This utterly breaks my design. The whole point of what I am trying to achieve is, anyone anywhere should be able to implement that trait, inside or outside any crate, to be able to serialize whatever data they have, in whatever esoteric format it is.
Can I somehow bypass this restriction? If not, is there another way I could enforce at runtime "give me an object that has this function signature among its methods"?
Rust has the orphan rule, which says that any implementation of a trait on a type must exist in the same crate as at least one of the trait or the type. In other words, both types can't be foreign. (It's a bit more complex than this -- for example, you can implement a foreign generic trait on a foreign type if a generic type argument to the generic trait is a type declared in the local crate.)
This is why you frequently see so-called "newtypes" in Rust, which are typically unit structs with a single member, whose sole purpose is to implement a foreign trait on a foreign type. The newtype lives in the same crate as the implementation, so the compiler accepts the implementation.
This could be realized in your example with a newtype around the array type:
#[repr(transparent)]
struct VertexArray<const N: usize>(pub [Vertex; N]);
impl<const N: usize> Deref for VertexArray<N> {
type Target = [Vertex; N];
fn deref(&self) -> &Self::Target {
&self.0
}
}
impl<const N: usize> DerefMut for VertexArray<N> {
fn deref_mut(&mut self) -> &mut Self::Target {
&mut self.0
}
}
impl<const N: usize> GpuSerializable for VertexArray<N> {
// ...
}
Note that because of #[repr(transparent)], the layout of both VertexArray<N> and [Vertex; N] are guaranteed to be identical, meaning you can transmute between them, or references to them. This allows you to, for example, reborrow a &[Vertex; N] as a &VertexArray<N> safely, which means you can store all of your data as [Vertex; N] and borrow it as the newtype at zero cost whenever you need to interact with something expecting an implementation of GpuSerializable.
impl<const N: usize> AsRef<VertexArray<N>> for [Vertex; N] {
fn as_ref(&self) -> &VertexArray<N> {
unsafe { std::mem::transmute(self) }
}
}

How to use serde to serialize/deserialize a HashMap with enum keys that contain data? [duplicate]

This question already has answers here:
Cant serialize when using enum as key in Hashmap
(2 answers)
Closed 9 months ago.
I need to serialize and deserialize a HashMap, h, with enum Foo as a key to and from JSON. Foo's variants contain data (here simplified to u32, but actually are enums themselves):
use serde::{Serialize, Deserialize};
use serde_json;
use std::collections::HashMap;
#[derive(Serialize, Deserialize)]
enum Foo {
A(u32),
B(u32),
}
// Tried several different things here! Just deriving the relevant traits doesn't work.
struct Bar {
h: HashMap<Foo, i32>, // The i32 value type is arbitrary
}
fn main() {
let mut bar = Bar { h: HashMap::new() };
bar.h.insert(Foo::A(0), 1);
// I want to be able to do this
let bar_string = serde_json::to_string(&bar).unwrap();
let bar_deser: Bar = serde_json::from_str(&bar_string).unwrap();
}
Since the JSON specification requires keys to be strings I know I need to customise the way serialization and deserialization are done for Foo when it is a key in h. I've tried the following:
Custom implementations of Serialize and Deserialize (e.g. in the accepted answer here)
Serde attributes e.g.: #[serde(into = "String", try_from = "String")] + implementing Into<String> for Foo and TryFrom<String> for Foo (described in the answers here)
Using the 'serde_with' crate (also described in the answers here).
Unfortunately none of these have worked - all eventually failed with different panics after compiling successfully.
What is a good way to achieve what I want in serde, if there is one? If not, I'd be very grateful for any workaround suggestions.
Bonus question: why doesn't serde/serde_json provide a default serialization to a String + deserialization of an enum like Foo when it is used as a key in a HashMap?
Here is a working solution with serde_as from the serde_with helper crate:
use serde::{Deserialize, Serialize};
use serde_json;
use std::collections::HashMap;
#[derive(Serialize, Deserialize, Hash, Eq, PartialEq, Debug)]
enum Foo {
A(u32),
B(u32),
}
#[serde_with::serde_as]
#[derive(Serialize, Deserialize, Debug)]
struct Bar {
#[serde_as(as = "HashMap<serde_with::json::JsonString, _>")]
h: HashMap<Foo, i32>,
}
fn main() {
let mut bar = Bar { h: HashMap::new() };
bar.h.insert(Foo::A(0), 1);
let bar_string = serde_json::to_string(&bar).unwrap();
let bar_deser: Bar = serde_json::from_str(&bar_string).unwrap();
println!("{:?}", bar);
println!("{}", bar_string);
println!("{:?}", bar_deser);
}
Bar { h: {A(0): 1} }
{"h":{"{\"A\":0}":1}}
Bar { h: {A(0): 1} }
Bonus question: why doesn't serde/serde_json provide a default serialization to a String + deserialization of an enum like Foo when it is used as a key in a HashMap?
If you noticed, most of those problems become runtime errors.
That is because serde actually does have a representation of a map that works with all Rust values. So serde itself has no power over this problem.
The problem comes when serde_json tries to feed the map into a json Serializer. And yes, this serializer could indeed convert any key into a JSON string and back if that was implemented.
I think it's more of a design decision why this isn't done. If JSON only supports string keys, so should the JSON serializer. If the user wants to use other types as keys, he should convert them to strings first as seen in the code example above.

How does the ? operator interact with the From trait?

Say I have the following:
use std::fs::File;
impl From<i32> for Blah {
fn from(b:i32) -> Blah {
Blah {}
}
}
fn main() {}
enum MyError {
ParseError,
}
impl From<std::io::Error> for MyError {
fn from(_:std::io::Error) -> Self {
MyError::ParseError
}
}
fn get_result() -> Result<Blah, MyError> {
let mut file = File::create("foo.txt")?;
}
This compiles fine. I don't understand how.
File::create throws an std::io::error, which we're trying to wrap in a MyError. But we never explicitly call from anywhere!? How does it compile?
As the comments from this answer Rust understanding From trait indicate, you do have to explicitly call from.
So, how is the above snippet compiling?
The difference is stated in The Rust Programming Language, chapter 9, section 2, when talking about the ? operator:
Error values that have the ? operator called on them go through the from function, defined in the From trait in the standard library, which is used to convert errors from one type into another. When the ? operator calls the from function, the error type received is converted into the error type defined in the return type of the current function. This is useful when a function returns one error type to represent all the ways a function might fail, even if parts might fail for many different reasons. As long as each error type implements the from function to define how to convert itself to the returned error type, the ? operator takes care of the conversion automatically.
You have provided this implementation of From<std::io::Error> for that error type, therefore the code will work and convert values of this type automatically.
The magic is in the ? operator.
let mut file = File::create("foo.txt")?;
expands to something like (source)
let mut file = match File::create("foo.txt") {
Ok(t) => t,
Err(e) => return Err(e.into()),
};
This uses the Into trait, which is the counterpart to the From trait: e.into() is equivalent to T::from(e). Here you have the explicit conversion.
(There is an automatic impl<T, U> Into<U> for T for every impl<T, U> From<T> for U, which is why implementing From is enough.)

How can I return an iterator over a slice?

fn main() {
let vec: Vec<_> = (0..5).map(|n| n.to_string()).collect();
for item in get_iterator(&vec) {
println!("{}", item);
}
}
fn get_iterator(s: &[String]) -> Box<Iterator<Item=String>> {
Box::new(s.iter())
}
fn get_iterator<'a>(s: &'a [String]) -> Box<Iterator<Item=&'a String> + 'a> {
Box::new(s.iter())
}
The trick here is that we start with a slice of items and that slice has the lifetime 'a. slice::iter returns a slice::Iter with the same lifetime as the slice. The implementation of Iterator likewise returns references with that lifetime. We need to connect all of the lifetimes together.
That explains the 'a in the arguments and in the Item=&'a part. So what's the + 'a mean? There's a complete answer about that, and another with more detail. The short version is that an object with references inside of it may implement a trait, so we need to account for those lifetimes when talking about a trait. By default, that lifetime is 'static as it was determined that was the usual case.
The Box is not strictly required, but is a normal thing you'll see when you don't want to deal with the complicated types that might underlie the implementation (or just don't want to expose the implementation). In this case, the function could be
fn get_iterator<'a>(s: &'a [String]) -> std::slice::Iter<'a, String> {
s.iter()
}
But if you add .skip(1), the type would be:
std::iter::Skip<std::slice::Iter<'a, String>>
And if you involve a closure, then it's currently impossible to specify the type, as closures are unique, anonymous, auto-generated types! A Box is required for those cases.

Providing an implementation when both trait and type are not in this crate [duplicate]

This question already has answers here:
How do I implement a trait I don't own for a type I don't own?
(3 answers)
Closed 7 years ago.
I want to provide an implementation of a trait ToHex (not defined by me, from serialize) for a primitive type u8:
impl ToHex for u8 {
fn to_hex(&self) -> String {
self.to_str_radix(16)
}
}
The problem is I get this compiler error:
error: cannot provide an extension implementation where both trait and type are not defined in this crate
I understand the reason of this error and its logic, this is because both the trait and the primitive type are external to my code. But how can I handle this situation and provide an ToHex implementation for u8? And more generally how do you handle this kind of issue, it seems to me that this problem must be common and it should be possible and easy to extend types like this?
You should use a newtype struct to do this:
pub struct U8(pub u8)
impl ToHex for U8 {
fn to_hex(&self) -> String {
let U8(x) = *self;
x.to_str_radix(16)
}
}
This does mean, however, that you should wrap u8 into U8 where you need to perform this conversion:
let x: u8 = 127u8
// println!("{}", x.to_hex()); // does not compile
println!("{}", U8(x).to_hex());
This is absolutely free in terms of performance.
I realize this is almost a year old, but the answer was never accepted and I think I've found an alternate solution, that I thought would be good to document here.
In order to extend the functionality of the u8 through traits, instead of trying to extend ToHex, why not create a new trait?
trait MyToHex {
fn to_hex(&self) -> String;
}
impl MyToHex for u8 {
fn to_hex(&self) -> String {
format!("{:x}", *self)
}
}
then used like so
fn main() {
println!("{}", (16).to_hex());
}
This has the advantage that you don't have to wrap every u8 variable with a new and superfluous data type.
The disadvantage is that you still can't use a u8 in a external function (i.e std library, or one you have no control over) that requires the ToHex trait (Vladimir Matveev's solution works in this case), but from OP it sounds like all you want to do is extend u8 only inside your code.