State Machines and the Strange Case of Mutating API

sriram_malhar · on Dec 24, 2018

This is precisely the problem addressed by typestates and session types. Each state has a unique type, and in that state, one can only call the appropriate functions.

Interestingly, Rust used to have typestates in its very early incarnation. See: https://pcwalton.github.io/2012/12/26/typestate-is-dead.html

Sharlin · on Dec 24, 2018

Rust's affine typing solves precisely this problem as well. Your state change methods consume the previous state variable, making it a compile-time error to access it afterwards. Given code like in the article and assuming connect_unauthenticated takes self by value:

  let i1 = socks5_socket();
  let i2 = i1.connect_unauthenticated(proxy_addr);
  let i3 = i1.connect_tcp(addr);

Rust will complain:

  error[E0382]: use of moved value: `i1`
    |
  2 |     let i2 = i1.connect_unauthenticated(proxy_addr);
    |             - value moved here
  3 |     let i3 = i1.connect_tcp(addr);
    |              ^ value used here after move
    |
    = note: move occurs because `i1` has type `socks5_socket`, which does not implement the `Copy` trait
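For anyone who wants to try this, here is a minimal sketch of how such an API could be laid out. It is only an illustration under assumptions: the type names (`Socks5Socket`, `Authenticated`, `TcpConnected`), the `describe` helper, and the addresses are hypothetical and not taken from the article or any real SOCKS5 crate, and no networking happens. It combines one type per state with methods that take `self` by value.

```rust
// Hypothetical state types; no real I/O is performed.
struct Socks5Socket; // fresh, not yet connected to the proxy
struct Authenticated { proxy: String }
struct TcpConnected { proxy: String, target: String }

fn socks5_socket() -> Socks5Socket {
    Socks5Socket
}

impl Socks5Socket {
    // Takes `self` by value: the old state is moved and cannot be used again.
    fn connect_unauthenticated(self, proxy_addr: &str) -> Authenticated {
        Authenticated { proxy: proxy_addr.to_string() }
    }
}

impl Authenticated {
    // Only available once the proxy handshake has produced this state.
    fn connect_tcp(self, addr: &str) -> TcpConnected {
        TcpConnected { proxy: self.proxy, target: addr.to_string() }
    }
}

impl TcpConnected {
    fn describe(&self) -> String {
        format!("{} via {}", self.target, self.proxy)
    }
}

fn main() {
    let i1 = socks5_socket();
    let i2 = i1.connect_unauthenticated("proxy:1080");
    let i3 = i2.connect_tcp("example.org:80");
    // let again = i1.connect_unauthenticated("proxy:1080"); // rejected: use of moved value `i1`
    println!("{}", i3.describe());
}
```

With this layout, calling connect_tcp on a fresh socket is also rejected at compile time, because `Socks5Socket` simply has no such method.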

sriram_malhar · on Dec 25, 2018

Yes, between traits and alias control, you have everything required to solve OP's problem safely and efficiently. The only remaining aspect, I feel, is aesthetic, as the errors show: they report aliasing/ownership violations rather than an invalid operation on a single state machine.

goldenkey · on Dec 25, 2018

That is really marvelous. How is this kind of thing implemented under the hood?

That new whiz-kid language Zig by Andrew Kelley - I wonder if it has anything similar?

a1369209993 · on Dec 25, 2018

You can actually do the type-checking part of this even in C, it's just uglier:

  struct socks5_blank i1 = socks5_socket();
  struct socks5_authed i2 = connect_unauthenticated(i1, proxy_addr);
  struct socks5_tcp i3 = connect_tcp(i1, addr); /* error: expected struct socks5_authed, got struct socks5_blank */

Rust adds the constraint that any (non-copyable, non-droppable) variable has to be passed to a function (or operator, or return statement) exactly once:

  Noncopyable a = ...;
  Copyable b = ...;
  f(a,b); //=> f(a,copy(&b)) // a is no longer in scope after this
  g(b); // this is fine, we took a reference to b and made a copy without changing the original
  h(a); // error: a is no longer a live variable
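In actual Rust syntax, the pseudocode above might look like the following sketch. The names `Noncopyable`, `Copyable`, `f`, and `g` are just the placeholders from the pseudocode turned into hypothetical types and functions. One caveat: ordinary Rust values can also be dropped without ever being passed anywhere, so what the compiler enforces here is really "at most once".

```rust
// Hypothetical types mirroring the pseudocode above.
struct Noncopyable(u32); // no Copy impl: passing by value moves it

#[derive(Clone, Copy)]
struct Copyable(u32); // Copy: passing by value duplicates it

fn f(a: Noncopyable, b: Copyable) -> u32 {
    a.0 + b.0
}

fn g(b: Copyable) -> u32 {
    b.0 * 2
}

fn main() {
    let a = Noncopyable(1);
    let b = Copyable(2);
    println!("{}", f(a, b)); // `a` is moved into f; `b` is copied
    println!("{}", g(b)); // fine: the original `b` is still usable
    // f(a, b); // rejected: use of moved value `a`
}
```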

rumcajz · on Dec 25, 2018

The problem with the C solution is that you'll get hanging pointers. Try using i1 after the code above executes and you'll get a runtime error.

a1369209993 · on Dec 25, 2018

Er, yes; that's what I said (or rather, hanging various things including but not limited to pointers). C only supports the type-checking part of this, because it doesn't have any facilities for compile-time linearity checking.

Linearity[0] and dependency[1] checking are what make Rust an improvement over C, rather than yet another "let's reinvent C++, but slightly less awful" project.

0: "error: no implementation for copy〈T〉","error: no implementation for drop〈T〉"[2]

1: "error: cannot move/destroy x (of type T); y (of type T&'x) outlives it"

capitalsigma · on Dec 25, 2018

You can use C++ unique_ptr and turn on the "use after move" check in Clang's tooling (clang-tidy's bugprone-use-after-move), if you want that level of protection.

tiuPapa · on Dec 25, 2018

I think this is a feature of Rust's ownership model.

xaedes · on Dec 24, 2018

> What options do we have to support such mutating API at the moment?

What is wrong with implementing the states with polymorphic classes providing the API or an interface to it?

geezerjay · on Dec 25, 2018

> What is wrong with implementing the states with polymorphic classes providing the API or an interface to it?

Yeah, that's pretty much the basic approach to this problem: use a strategy pattern in all states of the state machine, and just provide definitions for each method in the states where that method makes sense.

That's pretty much how HTTP-based protocols are implemented when status codes drive a state machine.

I wonder why the author decided to ignore the most basic and simple solutions to a frequent problem in favour of a convoluted solution that doesn't add anything and is harder to debug.
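For comparison with the compile-time approaches elsewhere in the thread, the state-object approach described above might look roughly like this in Rust. This is a sketch under assumptions: the `State` trait, the `Blank`/`Authed`/`Tcp` structs, and the `Socket` wrapper are hypothetical names, and nothing touches the network. Each state overrides only the methods that make sense for it; everything else falls through to a default that fails at run time.

```rust
// Hypothetical sketch: invalid calls are detected at run time, not compile time.
trait State {
    fn name(&self) -> &'static str;

    // Defaults: the operation is not valid in this state.
    fn connect_unauthenticated(&self, _proxy: &str) -> Result<Box<dyn State>, String> {
        Err(format!("connect_unauthenticated is invalid in state {}", self.name()))
    }
    fn connect_tcp(&self, _addr: &str) -> Result<Box<dyn State>, String> {
        Err(format!("connect_tcp is invalid in state {}", self.name()))
    }
}

struct Blank;
struct Authed;
struct Tcp;

impl State for Blank {
    fn name(&self) -> &'static str { "blank" }
    fn connect_unauthenticated(&self, _proxy: &str) -> Result<Box<dyn State>, String> {
        Ok(Box::new(Authed))
    }
}

impl State for Authed {
    fn name(&self) -> &'static str { "authed" }
    fn connect_tcp(&self, _addr: &str) -> Result<Box<dyn State>, String> {
        Ok(Box::new(Tcp))
    }
}

impl State for Tcp {
    fn name(&self) -> &'static str { "tcp" }
}

// The object callers hold; it swaps its internal state on each transition.
struct Socket { state: Box<dyn State> }

impl Socket {
    fn new() -> Self { Socket { state: Box::new(Blank) } }
    fn connect_unauthenticated(&mut self, proxy: &str) -> Result<(), String> {
        self.state = self.state.connect_unauthenticated(proxy)?;
        Ok(())
    }
    fn connect_tcp(&mut self, addr: &str) -> Result<(), String> {
        self.state = self.state.connect_tcp(addr)?;
        Ok(())
    }
    fn state_name(&self) -> &'static str { self.state.name() }
}

fn main() {
    let mut s = Socket::new();
    println!("{:?}", s.connect_tcp("example.org:80")); // an Err: wrong state
    println!("{:?}", s.connect_unauthenticated("proxy:1080"));
    println!("{:?}", s.connect_tcp("example.org:80"));
    println!("{}", s.state_name());
}
```

The trade-off is the one the thread is circling: the caller sees a single type with every method on it, so a misuse only shows up as an error value when the code runs.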

rahulmehta95 · on Dec 24, 2018

I was thinking that using Visitors would be a solution, which is a form of implementing a polymorphic API.

rumcajz · on Dec 24, 2018

Can you give an example?

sovande · on Dec 24, 2018

Apple's GameplayKit has several examples

https://developer.apple.com/library/archive/documentation/Ge...

typedef_struct · on Dec 26, 2018

I'd probably do it like this:

  socks5_socket([&](auto closed_socket) {
      closed_socket.connect_unauthenticated(proxy_addr, [&](auto authenticated_socket) {
          authenticated_socket.connect_tcp(addr, [&](auto tcp_socket) {
              //tcp_socket.send or w/e
          });
      });
  });

ppppppaul · on Dec 25, 2018

problem solved via pattern matching (visitor pattern in oop).
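A rough sketch of what the pattern-matching version could look like, assuming the states and operations are modelled as plain enums. The `Socks5` and `Event` enums and the `step` function are hypothetical names for illustration. Like the polymorphic-class approach, an invalid transition is reported at run time rather than rejected by the compiler.

```rust
// Hypothetical enums: one variant per state, one variant per operation.
#[derive(Debug, PartialEq)]
enum Socks5 {
    Blank,
    Authed { proxy: String },
    Tcp { proxy: String, target: String },
}

#[derive(Debug, PartialEq)]
enum Event {
    ConnectUnauthenticated(String),
    ConnectTcp(String),
}

// Consumes the old state and returns the new one, or an error message
// when the event makes no sense in the current state.
fn step(state: Socks5, event: Event) -> Result<Socks5, String> {
    match (state, event) {
        (Socks5::Blank, Event::ConnectUnauthenticated(proxy)) => Ok(Socks5::Authed { proxy }),
        (Socks5::Authed { proxy }, Event::ConnectTcp(target)) => Ok(Socks5::Tcp { proxy, target }),
        (state, event) => Err(format!("{:?} is not valid in state {:?}", event, state)),
    }
}

fn main() {
    let s = step(Socks5::Blank, Event::ConnectUnauthenticated("proxy:1080".to_string()));
    println!("{:?}", s);
    let bad = step(Socks5::Blank, Event::ConnectTcp("example.org:80".to_string()));
    println!("{:?}", bad);
}
```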