Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DRAFT] Initial socket support #30

Draft
wants to merge 27 commits into
base: main
Choose a base branch
from

Conversation

milseman
Copy link
Contributor

@milseman milseman commented Feb 27, 2021

Draft of SocketDescriptor and support for:

Prototyped

  • socket, with domains (protocol/address families) and connection types (stream vs datagram)
  • send/recv, with message flags
  • listen
  • shutdown
  • Socket options (socket, ipv4, ipv6, and tcp) and getsockopt and setsockopt
  • helpers, akin to FileDescriptor's helpers
  • getaddrinfo-like functionality
  • sockaddr_t and friends, and syscalls taking sockaddr_t.
  • system-samples command line tool

TODO

  • unavailable-renamed entries for constants,
  • Linux and Windows ports
  • Better testing and mocking of socket options

Questions

  • What is interruptible and what isn't?
  • What should be Codable?
    • What is the policy, is it for same-process, cross-process, cross-system, or cross-platform serialization?
  • Should we instead have a TCP, IP, IPv6 namespace for socket options?
    • Then, level and option_name are in effect rolled into one parameter
    • Are the IPv4 options usable on IPv6 sockets? Should they have v4 in their names?

Deferred

  • network byte ordering operations (if any beyond stdlib)
  • gethostent, getnetent, getprotoent, getservent, etc., unless needed for sockaddr_t
  • failable check if a FileDescriptor really is a socket (stat)
  • Bool or simple value-oriented API for options

#elseif os(Windows)
@_implementationOnly import ucrt
import CSystem
import ucrt
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, Windows is going to be more complicated :-(. It is going to drag in WinSDK (which is effectively Darwin on macOS). The socket interfaces are not part of the C library but an entirely separate framework. The problem is that the module (as currently setup) will potentially drag in additional unnecessary link dependencies. I wonder if we want to add a local module to pull in the WinSock interfaces as @_implementationOnly.

We will definitely need an additional shim for adding a module constructor to initialize WinSDK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, is linking against WinSDK a problem? Could the shim module link against just Ws2_32.lib?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WinSDK is not a library but rather a hand rolled module that encompasses all of the Windows SDK. Yes, the shim module can just link against Ws2_32.lib and that is sufficient. I'm just worried about overlinking, but, I suppose that it should be easy to check. The WinSDK module isn't great though and needs to be improved (it has been built up organically and tactically as we dont have good control over the SDK).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we dynamically link to it or dlsym equivalent? I don't really know Windows best practices or conventions, but my recollection is that it was shockingly in favor of hot loading .dlls.

Copy link
Member

@lorentey lorentey Mar 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, forcing unnecessary linkages on people would be a significant obstacle to adoption. (E.g., see corelibs-foundation.) There ought to be no reasonable objection against adopting swift-system.

I expect at some point we will have to take the step of splitting up this package into multiple system-dependent modules, with the top-level SystemPackage module reexporting each component.

It should be possible to use the core APIs in this package without linking with winsocks.

What modules (if any) need to be spun off is different on every platform, and the package should fully reflect that.

@milseman milseman mentioned this pull request Feb 28, 2021
@milseman milseman force-pushed the network_receptacles branch from 18b96f6 to c11d8fa Compare February 28, 2021 03:17
@milseman
Copy link
Contributor Author

milseman commented Feb 28, 2021

edit: No, without struct sub typing, trying to separate socket options by level is a lot more machinery and generates confusion.

Copy link

@kylemacomber kylemacomber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some superficial comments

/// Treat `fd` as a socket descriptor, without checking with the operating
/// system that it actually refers to a socket
@_alwaysEmitIntoClient
public init(unchecked fd: FileDescriptor) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if unchecked is the right argument label to use here. It makes me think of Range(uncheckedBounds:), in which case the checking is a bounds check precondition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Range, is that precondition later checked? Here, the failure mode is "recoverable", i.e. Errno is thrown, but it is still an operation that doesn't check that the result is valid until use. "unsafe" is flat out as this will be checked and does not produce UB.

Copy link
Member

@lorentey lorentey Mar 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No -- the unchecked in Range(uncheckedBounds:) indicates that failing to fulfill the unchecked precondition will lead to undefined behavior.

This is not the case here at all -- the difference between FileDescriptor and SocketDescriptor is a figment of our own imagination; translating between the two can never lead to UB.

Sources/System/Sockets/SocketOperations.swift Outdated Show resolved Hide resolved
Sources/System/Sockets/SocketOperations.swift Outdated Show resolved Hide resolved
}
if scopeid {
flags.insert(.scopeIdentifier)
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@natecook1000 , does the argument parser allow adding members to an OptionSet directly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No - it would be really cool to have that as a feature, where a group of flags could all be stored together in an OptionSet type. You’d need to provide the metadata specifically, since those aren’t typically CaseIterable the way enums often are.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is something you’re using multiple places, you could move the flags to a separate type, vend the option set as a property, and then include it as an @OptionGroup.


/// TODO
@frozen
public struct SocketDescriptor: RawRepresentable, Hashable {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how are we later on doing kqueue/epoll where we can add sockets and pipes (and for kqueue also file descriptors (although they don't work like you'd expect to))?

Copy link
Member

@lorentey lorentey Mar 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do that either by requiring the user to pass socket.fileDescriptor to these constructs, or by adding direct overloads that take a SocketDescriptor, or by introducing a protocol for these things. Note that the latter option is highly unlikely, I think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lorentey Right. The "issue" with this is that for epoll for example, you should only pass pipes/sockets/ttys etc. If you pass a regular file, then it won't work (and error, think even EBADF :) ). So given that we have some form of typing, it's kinda sad to remove it to go to epoll which internally does require the right type. But maybe this is a step too far for System

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't explored kqueue yet, but I suspect we'll follow a similar strategy here with a custom descriptor type. We might end up with rich type hierarchies for the different kinds of descriptors, so epoll would only receive pipe-family descriptors, but that's TBD.

If Swift had struct subtyping, this would all be way easier to model (as would many other things), but we can instead rely on conventions and protocols.

Copy link
Member

@lorentey lorentey Mar 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The distinction between SocketDescriptor and FileDescriptor is just a namespacing convenience -- it's not real. If we add a KqueueDescriptor, it'll also be just a namespacing construct.

The primary goal of System is to eliminate the difficulty of calling syscalls from Swift, and especially to fix the associated memory safety issues.

Building abstractions that don't directly map to the underlying interfaces aggressively works against this goal -- however desirable they might be. Such abstractions are important (dare I say, crucial) to have, but they belong in higher layers such as NIO.

Other than ensuring memory safety, System cannot afford to be strongly opinionated about how people want to use syscalls; otherwise some folks will be forced to continue calling the unsafe C originals, and memory management / buffer overflow issues will be a problem forever.

Making it more difficult to call syscalls just to prevent EBADF/ENOTSOCK from getting thrown is definitely a non-goal to me. It's not unsafe to call epoll on a regular file; we'll just get an error thrown at runtime. This is like calling seek on a terminal device or socket (in fact, that's even "worse", as seek doesn't even throw an error).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have also been working on Sockets support for System (mainly for Bluetooth and Netlink on Linux, not IPv4). Just for some context we are building embedded devices with Swift 5 and Yocto / Buildroot, so all our Swift packages are eventually vendored as .so shared libraries and system frameworks.

I strongly believe that adding a duplicate FileDescriptor wrapper for sockets (what about pipes) defeats the purpose of this API and will lead to code duplication. IMHO the System library should be using mostly by library vendors and not most end users directly, and other abstractions like memory management (automatically closing FDs) should be done in another layer.

https://github.com/PureSwift/swift-system/blob/feature/network/Sources/System/SocketOperations.swift

https://github.com/PureSwift/BluetoothLinux
https://github.com/PureSwift/Netlink

@lorentey
Copy link
Member

lorentey commented Mar 3, 2021

What should be Codable?

I think we can make SocketAddress.IPv4.Address, SocketAddress.IPv6.Address and SocketAddress.Port codable. The addresses should probably serialize in String form.

Codable for SocketAddress.IPv4, SocketAddress.IPv6 and SocketAddress.Local may also make sense, using a keyed container for the various components. However, I believe the underlying sockaddr_* structs may have system-dependent components that wouldn't necessarily survive serialization, so it's probably best not to do it.

None of the constant enums or other structs seem Codable to me.


/// Specify the part (or all) of a full-duplex connection to shutdown.
@frozen
public struct ShutdownKind: RawRepresentable, Hashable, Codable, CustomStringConvertible {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'll want to get rid of Codable here.

@millenomi
Copy link

To answer your Codable question: Codable abstracts your serialization format from your type structure, and thus is suitable for all kinds of serialization, including cross-process and cross-machine. (One of its primary use cases is to parse JSON coming from arbitrary servers, for instance.)

Copy link

@dfunckt dfunckt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happened upon a couple typos and thought I'd mark them even though that part appears to be WIP.

Sources/System/Sockets/SocketOptions.swift Outdated Show resolved Hide resolved
Sources/System/Sockets/SocketOptions.swift Outdated Show resolved Hide resolved
milseman and others added 16 commits April 10, 2021 14:14
Add Swifty wrappers for socket and IP addresses
* Implement UNIX domain socket addresses

* Implement standalone SocketAddress.Family enum

* Add availability comments

* [WIP] Implement sendmsg/rcvmsg

* SocketAddress: Use _RawBuffer.

* ControlMessageBuffer: Fix issues uncovered in testing

* Enums: use mutable rawValues

This helps simplify code that needs to update these directly.

* Update/simplify sendmsg/recvmsg implementations

* Mocking: Add support for wildcard argument matching

* Add some AncillaryMessageBuffer tests; fix issues.

* Apply suggestions from code review

Co-authored-by: Michael Ilseman <[email protected]>

* Make mock failure output a little easier to follow

* Implement support for getaddrinfo

* Add an executable with very simple sample code

* Add availability marker for new CInterop typealiases

* Do not use implicitly unwrapped optionals

* Rework sendmsg/recvmsg, make some samples

* Updates/cleanups/discussion results

Co-authored-by: Michael Ilseman <[email protected]>
milseman and others added 5 commits April 10, 2021 14:16
* Update getsockopt/setsockopt

* Add getsockopt/setsockopt docs

* Doc updates
* Update getsockopt/setsockopt

* Add getsockopt/setsockopt docs

* Doc updates

* Add overloads for bind/connect taking concrete address types

* ShutdownKind: Remove Codable conformance.

* Listen: close the client connection socket before exiting

This is supposed to demonstrate acceptable use, we can’t leave resource cleanup to exit()

* Make IPv4.Address and IPv6.Address expressible by string literals

* Document SocketAddress better.
@milseman milseman force-pushed the network_receptacles branch from 42b8e22 to 2b0b28c Compare April 10, 2021 20:28
@milseman
Copy link
Contributor Author

Rebased, fixed conflicts, and then extracted everything into a separate SystemSockets module. Still need to clean up some of the shared code, probably do a separate test target, and figure out how to get mocking hooked up.

@karwa
Copy link

karwa commented May 13, 2021

This would be great!

It's really awkward to use the system's socket API via the clang importer - for instance, it seems like every system uses a different name for the union property of an in6_addr; on Darwin it's __u6_addr, Glibc uses __in6_u, Windows does something different as well, but I can't figure out what...

FYI, WebURL includes IPv4 and v6 address types, with parsing and serialisation in Swift. The parsers support throwing errors, and are compatible with the obscure formats supported by inet_aton.

I'd be happy to contribute those to swift-system. IMO, it's nicer if you can avoid platform dependencies, because you can guarantee more consistent behaviour, and there is not really anything about IP addresses which should be specific to any system. The implementation is fairly well isolated to a single file, outside of a couple of ASCII constants and some pointer utilities for using tuples as fixed-size arrays.

@Lukasa
Copy link

Lukasa commented May 14, 2021

I’m not speaking for @milseman here, but while I think good IP address types are very valuable, they probably don’t belong in Swift System. In many ways, Swift System is not trying to paper over the differences between OSes, but is instead exposing them.

However, I definitely think some suitable currency types here would be valuable. They may belong in the standard library more than they belong here. Relatedly, currency types for the wider “sockaddr” notion would also be useful, as those types are drastically more complex than an IP address alone.

@karwa
Copy link

karwa commented May 14, 2021

Well, the API roadmap says:

System aims to provide API in 3 areas:
...
2. Common types for library API at the systems level
System hosts common types for systems level programming, enabling libraries built on top of System to use the same, well-> crafted types in their API.

So that's why I thought they might fit here; papering over OS differences is just a bonus. If I understand you correctly @Lukasa, you're saying that there may be room for a library below swift-system (maybe the stdlib), containing currency types defined by e.g. networking standards? That's an interesting idea.

@Lukasa
Copy link

Lukasa commented May 14, 2021

It does, but the README also says:

Multi-platform not Cross-platform

System is a multi-platform library, not a cross-platform one. It provides a separate set of APIs and behaviors on every supported platform, closely reflecting the underlying OS interfaces. A single import will pull in the native platform interfaces specific for the targeted OS.

Our immediate goal is to simplify building cross-platform libraries and applications such as SwiftNIO and SwiftPM. System does not eliminate the need for #if os() conditionals to implement cross-platform abstractions, but it does make it safer and more expressive to fill out the platform-specific parts.

Naturally these two ideas are in tension, but I think it would be good to holistically consider whether swift-system is the right place to define an IPAddress type, or whether it's simply somewhere that should use it.

@milseman
Copy link
Contributor Author

milseman commented Oct 3, 2021

I do want to resuscitate this and figure out how to move forwards.

@karwa thanks for the offer for IP address parsing code. Is this a desire to avoid making a syscall, or are there concerns about platform availability? Are you able to support more API with a native implementation? We have construction of address from string via pton et al.

System is surfacing platform differences, but in a way that should make it easier to write platform-conscious code (including working in portable subsets). Sort of like the P is POSIX, it should be possible to do (even if not ideally) without crazy dangerous unsafe pointers. At the same time, some concepts like FilePath are so important and interwoven with libraries that do wish to work with portable interfaces that we invest significant amounts of library code and API design into supporting.

@colemancda
Copy link
Contributor

I am currently working on the same thing, I am making the Socket address, options and family type-safe wrappers over the C API using generics.

https://github.com/PureSwift/swift-system/tree/feature/network

@colemancda
Copy link
Contributor

I wanted to propose some API design changes based on my work to open a discussion on the merits of each:

  • Syscalls.swift should be the only file to import the C stdlib (glibc) functions and expose them internally.
  • CInterop.swift should be the only file to import the C stdlib (glibc) types and expose them publicly.
  • Constants.swift should be the only file to import the C stdlib (glibc) constants and expose them internally.
  • SocketAddressFamily RawRepresentable struct for _AF_INET definitions (similar to FilePermissions structure).
  • SocketType RawRepresentable struct for _SOCK_STREAM definitions (similar to FilePermissions structure).
  • SocketOptionLevel RawRepresentable struct for _SOL_SOCKET definitions (similar to FilePermissions structure).
  • SocketOptionID protocol with concrete enums implementing the protocol to wrap _SO_KEEPALIVE and additionally provide a type-safe enforced SocketAddressFamily constant (static var) when passing the value to setsockopt(). This allows types for Bluetooth on Windows and Netlink on Linux to be declared in a separate module (or package) while taking advantage of the generic methods on FileDescriptor. PureSwift's BluetoothLinux and Netlink packages have been tested in production for years and can benefit from this by consolidating a lot of C wrapper code by just adopting a simple protocol, contributing to smaller binaries on embedded systems and prevent bugs due to code duplication.
  • IPAddress RawRepresentable enum with concrete types for cases IPv4Address to publicly expose the inet_pton and inet_ntop functions and _INADDR_LOOPBACK constants. This design also promotes using values on stack when possible and computing strings when a user readable representation is needed, further decreasing usage of ARC. Similar to the implementation in this PR, a withUnsafeBytes() method is provided for C interoperability, but using a coding style and naming scheme familiar to users of the Swift stdlib. A big difference with this implementation is the decision to use String as RawValue instead of the underlying C type, instead making that value internal and only publicly exposed via withUnsafeBytes(), I made this to promote the trend of using Swift stdlib types preferred over C stdlib types where possible. An important use case that should highlight why we should try to "hide" the underlying C types from the type system where possible is the scenario where another module is using generics or protocol extensions on RawRepresentable, and then needing to import Darwin or Glibc since we made that a requirement and publicly exposed to the type system.
  • SocketOption protocol implemented via enums with associated values for the various socket options. The main reason for this is for the benefit of getsockopt. The returned value will be the enum and a finite list of options to enumerate over and safety unwrap the underlying value (e.g. .debug(Bool)). Another potential design had SocketOption itself with concrete type for each individual option, instead of an enum per Socket Option Level. Besides the increased code size and type "littering" due to needing a struct for every option, when the value is retrieved via getsockopt, instead of using generics to get a concrete type (enabling compiling optimizations) we would have to attempt to cast the protocol witness table to its concrete type. While setsockopt could just accept some concrete socket option value type that implements a protocol to provide the constants needed (e.g. _SOL_SOCKET, _SO_KEEPALIVE`) and the underlying C value, when retrieving a socket option, we would have a less type-safe result.
  • SocketAddress protocol with structs implementing the socket address for each protocol. The underlying C value is accessed only via withUnsafePointer().

I think the main take away is taking more advantage of generics for compiler optimizations and extending usage outside of this package, and preferring concrete types and enums with associated values over Any or a protocol container. When possible, associated types in public protocols should use Swift stdlib types and not CInterop types , instead preferring those for internal usage where possible. Underlying C structs are not public properties or RawValue but instead only accessed via withUnsafePointer() and optimized for the use cases when interacting with C. RawRepresentable structs with RawValue that are integers (e.g. mode_t) are fine, its only C structs and unions that are discouraged from being public properties and propagated in the public Swift type system. Extending C types to retrofit them to Swift (e.g. CustomStringConvertible) should be discouraged and instead new Swift wrappers like FilePermissions should be used. Each family of socket protocols (e.g. Unix, IPv4, Netlink) should have their own concrete types conforming to common protocols and the methods on FileDescriptor that interact with them should use generics and associated types to provide the necessary metadata and avoid manually specifying the socket address family.

@Lukasa
Copy link

Lukasa commented Oct 8, 2021

A big difference with this implementation is the decision to use String as RawValue instead of the underlying C type

I think it's a mistake to say that IP addresses have a "raw" value of String. I think removing the underlying C type as a raw value is laudable, but we shouldn't add String there: we should just not have the type be RawRepresentable at all.

SocketOption protocol implemented via enums with associated values for the various socket options.

enums here are very dangerous because they cannot be evolved: adding new enum cases is an API breaking change. We'll consistently need to add new types any time someone wants to add their pet socket option to the API, which is a bit of a mess. Given that socket options are unbounded anyway, we may just want a large family of structures instead.

Underlying C structs are not public properties or RawValue but instead only accessed via withUnsafePointer() and optimized for the use cases when interacting with C.

Presumably we want to be able to construct these values from their C representations? I don't see any particular harm in being able to use initialisers to transform back and forth.

@colemancda
Copy link
Contributor

I think it's a mistake to say that IP addresses have a "raw" value of String. I think removing the underlying C type as a raw value is laudable, but we shouldn't add String there: we should just not have the type be RawRepresentable at all.

Thats fair, I guess its a personal preference but I do see the logic of what you are saying. My logic was that since its the preferred way to construct the type (with the C type also able to instantiate it, but via more verbose unsafe initializer), then for the end user the fact we are not moving a String around (to avoid ARC) and converting it at the last minute to C types is merely an optimization and implementation detail. Most usage of this API in Swift will be creating the address from String and most of the usage of the C struct will predictably be used with interfacing with C stdlib, which this module is trying to wrap and discourage direct usage. If we remove RawRepresentable then rawValue could be replaced with description but I dont find a better candidate for init?(rawValue: RawValue). @Lukasa What would your suggestion be to replace that in a way that plays nicely with the Swift stdlib and is idiomatic (and doesn't reinvent the wheel)?

enums here are very dangerous because they cannot be evolved: adding new enum cases is an API breaking change. We'll consistently need to add new types any time someone wants to add their pet socket option to the API, which is a bit of a mess. Given that socket options are unbounded anyway, we may just want a large family of structures instead.

I am aware of the ABI aspect and I think we should freeze the enums we ship with this library. I think we need to weigh the pros and cons of having a couple dozen structs littering the type system for each option (and the performance impact of switching enums with associated values vs casting protocol containers), vs making a concrete enum of each family of socket options and the (IHMO small) danger of ABI breakage. If we are using the C / POSIX socket options (e.g. _SO_KEEPALIVE) its gonna change as much as POSIXError will in Foundation, which is pretty much never. I imagine the Darwin team at Apple has an idea if they plan to extend the POSIX socket options in the near future, but I would bet against it. As far as third parties implementing SocketOptionID and SocketOption protocols, their code will benefit from a generic setsockopt that accepts their concrete types, and keeping their ABI stable is possible but ultimately their responsibility, it doesn't affect the ABI of this module. They are not forced to use enums with associated values, but for Unix Sockets, IPv4, and Netlink and Bluetooth on Linux, I dont see the problem with freezing that ABI. Again I point to POSIXError in Foundation for Linux, Windows and Darwin (which I contributed to) as examples for using enums for C constants.

Presumably we want to be able to construct these values from their C representations? I don't see any particular harm in being able to use initialisers to transform back and forth.

Yes, but we want to make it as verbose as all the other unsafe APIs in Swift (e.g. withUnsafePointer()), not a simple .init(rawValue: sockaddr_in6).

@Lukasa
Copy link

Lukasa commented Oct 8, 2021

If we remove RawRepresentable then rawValue could be replaced with description but I dont find a better candidate for init?(rawValue: RawValue). @Lukasa What would your suggestion be to replace that in a way that plays nicely with the Swift stdlib and is idiomatic (and doesn't reinvent the wheel)?

init(string:) is fine.

Concretely, my objection here is that "127.0.0.1" isn't the raw representation of the IP address that it represents: UInt32(0x7F000001) is. "127.0.0.1" is one of many possible string representations of the same IP address. Users won't believe we're just passing strings around, because IP addresses aren't strings: they're numbers. This is the reason not to move Strings around: it has nothing to do with ARC, and everything to do with representing what an IP address actually is.

I am aware of the ABI aspect and I think we should freeze the enums we ship with this library.

That's good, because we don't have a choice: Swift packages literally cannot have open enums in them.

Again I point to POSIXError in Foundation for Linux, Windows and Darwin (which I contributed to) as examples for using enums for C constants.

POSIXError is a struct: https://github.com/apple/swift-corelibs-foundation/blob/599c05d83183454bea653b8843a9e26ca84f4a4c/Sources/Foundation/NSError.swift#L980.

Additionally, socket options in your design aren't constants: they have associated values, because they encode the type of the value. This is a good part of your design, and I want to see it persist: dealing with non CInt-sized socket options is painful. I'm just saying that enums is not a good fit for this design: structs are a perfect fit.

We don't have to "litter the namespace" with them: you can define them in an uninhabited enum and use that to namespace them. They just shouldn't be cases.

Yes, but we want to make it as verbose as all the other unsafe APIs in Swift (e.g. withUnsafePointer()), not a simple .init(rawValue: sockaddr_in6).

Why? It's not unsafe.

@colemancda
Copy link
Contributor

colemancda commented Oct 8, 2021

@Lukasa I think you misunderstood my pitch for SocketOption, its a protocol with concrete implementations for each protocol and socket level combinations. Third parties can create their own types that implement the protocol and they can be structs for each option, or enums with associated values. Whatever implementation they want in their code is fine, so long as they implement the required protocol. Even if you wanted to you could in theory extend SOL_LOCAL with your own options, although not sure the Kernel would accept that.

    @_alwaysEmitIntoClient
    public func setSocketOption<T: SocketOption>(
        _ option: T,
        retryOnInterrupt: Bool = true
    ) throws

    @usableFromInline
    internal func _setSocketOption<T: SocketOption>(
        _ option: T,
        retryOnInterrupt: Bool
    ) -> Result<(), Errno> {
        nothingOrErrno(retryOnInterrupt: retryOnInterrupt) {
            option.withUnsafeBytes { bufferPointer in
                system_setsockopt(self.rawValue, T.ID.optionLevel.rawValue, option.id.rawValue, bufferPointer.baseAddress!, UInt32(bufferPointer.count))
            }
        }
    }

    @_alwaysEmitIntoClient
    public func getSocketOption<T: SocketOption>(
        _ option: T.ID,
        retryOnInterrupt: Bool = true
    ) throws -> T

@colemancda
Copy link
Contributor

@Lukasa I will try your proposal for struct for socket options, thanks for your feedback. Feel free to keep an eye on my branch, its too WIP to open a PR yet, and honestly there is a lot of code I wish I just copied from this branch to save hours of my time.
With regards to POSIXError, I meant the POSIXErrorCode which is not defined in Foundation but in Swift.

https://github.com/apple/swift/blob/f34e3214449b2f16e31bd6b907dd08665f0c85fc/stdlib/public/Platform/POSIXError.swift#L269

@Lukasa
Copy link

Lukasa commented Oct 8, 2021

I don't think I did.

Here's the code directly from your module (as of eba8454e):

public protocol SocketOption {
    
    associatedtype ID: SocketOptionID
    
    var id: ID { get }
    
    func withUnsafeBytes<T>(_: ((UnsafeRawBufferPointer) -> (T))) -> T
}

public enum GenericSocketOption: SocketOption, Equatable, Hashable {
    
    public typealias ID = GenericSocketOptionID
    
    case debug(Bool)
    case keepAlive(Bool)
    
    @_alwaysEmitIntoClient
    public var id: ID {
        switch self {
        case .debug: return .debug
        case .keepAlive: return .keepAlive
        }
    }
    
    @_alwaysEmitIntoClient
    public func withUnsafeBytes<T>(_ pointer: ((UnsafeRawBufferPointer) -> (T))) -> T {
        switch self {
        case let .debug(value):
            return Swift.withUnsafeBytes(of: value.cInt) { bufferPointer in
                pointer(bufferPointer)
            }
        case let .keepAlive(value):
            return Swift.withUnsafeBytes(of: value.cInt) { bufferPointer in
                pointer(bufferPointer)
            }
        }
    }
}

My point here is that GenericSocketOption is not something that this library can ever evolve. We will get one shot to define the type, and then it'll be frozen in time forever. This requires us to audit for essentially all possible socket options associated with SOL_SOCKET on supported platforms and define as many of them as possible, or we'll lose the ability to refer to them by a shorthand. If we ever did want to add something to this library that added more options for SOL_SOCKET, we'd have to define a brand new type for them.

As to POSIXErrorCode, enums in the standard library can add new cases over time, so my objection does not apply to them. POSIXErrorCode, notably, is not a frozen enum.

@colemancda
Copy link
Contributor

colemancda commented Oct 8, 2021

@Lukasa So, I agree that using structs for SocketOption (encapsulating values) will work better than enums with associated values (no casting or switching), what about the SocketOptionID? I think for our usage its fine if we use enums, but I will admit that maybe for it to "fit in" with the other constants, we should make it RawRepresentable structs. The same question applies to SocketProtocol, which is a design I have already been using for years for Bluetooth on Linux and Netlink.

@Lukasa
Copy link

Lukasa commented Oct 8, 2021

For SocketOptionID I'd lean towards the extensible option of a RawRepresentable struct, but here I do think there is some scope to assume the shape won't change much.

Do you have a link to SocketProtocol?

@colemancda
Copy link
Contributor

SystemPackage:
https://github.com/PureSwift/swift-system/blob/feature/network/Sources/System/SocketProtocol.swift
Previous usage:
https://github.com/PureSwift/BluetoothLinux/blob/master/Sources/BluetoothLinux/BluetoothProtocol.swift
https://github.com/PureSwift/Netlink/blob/master/Sources/Netlink/SocketProtocol.swift

I know it's a small optimization, but the enum will use one byte on the stack vs Int32, and more than that I really like enums for constants (due to switching), especially if they are tied to C std lib or kernel drivers I know will not change for the foreseeable future. It just makes it a tiny bit safer but not allowing invalid constants.

@colemancda
Copy link
Contributor

colemancda commented Oct 8, 2021

Just to provide some context, we have been using L2CAP and HCI Bluetooth sockets on Linux for Bluetooth LE (Client and Server) to interact with the Linux BlueZ subsystem via Swift without the C userland library (due to licensing and bugs) and Netlink (with Codable serializer) as a replacement for CoreWLAN on Linux. Outside of ioctl and the POSIX Socket API we are directly communicating from Swift to the Linux kernel without wrapping C Userland libraries. This has been working in production since 2016 on Armv7 embedded devices (IoT and Home Automation products), so I am very investing in developing to this project as a kind of "Foundation for Glibc" or revival of my SwiftFoundation project (Before the great Swift 3 rename and Foundation Value Types), except limiting it to smaller scope of providing idiomatic and lightweight Swift APIs for C / POSIX functions, and avoid importing Glibc and Darwin directly in my own low level system libraries (or really ever in the future).

Other parts of C stdlib / POSIX API this framework should implement:

Also Foundation (really CoreFoundation) breaks every couple releases on Linux and 32 bit platforms in general, so the more #PureSwift libraries we can have, the more we can use Swift (instead of Rust) for embedded ARM development.

@karwa
Copy link

karwa commented Oct 11, 2021

@karwa thanks for the offer for IP address parsing code. Is this a desire to avoid making a syscall, or are there concerns about platform availability? Are you able to support more API with a native implementation? We have construction of address from string via pton et al.

So the offer was based on:

  1. The fact that I already wrote it, and test it as being compatible with pton/ntop
  2. The fact that IP addresses, as well as how to parse/serialise them, are well-defined by networking standards. They aren't system-specific, but they obviously come up in a lot of systems-level networking code.
  3. A native implementation can support generics. Not sure how useful that is generally, but in WebURL we want to allow for parsing a lazily-filtered string.

As for why we should have native IP address types in general:

  1. The C API is awful. For instance, if you want to view the octets of an IPv6 address (which is the best way to handle all IP addresses), you use access a different member on each platform: in6_addr.__u6_addr.__u6_addr8 on Darwin, in6_addr.__in6_u.__u6_addr8 (different union name) on GlibC, and in6_addr.u.Byte on Windows (🤷‍♂️). This is accompanied by a #define s6_addr <platform-specific-name> to give it the correct name, but the clang importer just chuckles at that and moves on.

  2. Address endianness can be better explained by a better API. In WebURL, we define that the internal storage has network-order (that's its "binary" representation, numeric value depends on how your machine arranges the bytes), and provide accessors which convert to host-order/"numeric" representations with consistent numeric value. I think it works better to explain what's going on, and since Swift has paired get/set accessors, the conversion to/from that representation happens automatically. We take those accessors for granted in Swift, but it's still a serious improvement over C.

That being said, I think @Lukasa's idea about this being in a library below swift-system is very attractive. They don't require any system functionality, so a new platform that doesn't support swift-system (e.g. webassembly) shouldn't be excluded from them. At the same time, they don't expose any system functionality, so if there was a theoretical port of swift-system to webassembly, it would have to include more than just IP addresses and any other such types.

@karwa
Copy link

karwa commented Oct 11, 2021

Thats fair, I guess its a personal preference but I do see the logic of what you are saying. My logic was that since its the preferred way to construct the type (with the C type also able to instantiate it, but via more verbose unsafe initializer), then for the end user the fact we are not moving a String around (to avoid ARC) and converting it at the last minute to C types is merely an optimization and implementation detail. Most usage of this API in Swift will be creating the address from String and most of the usage of the C struct will predictably be used with interfacing with C stdlib, which this module is trying to wrap and discourage direct usage

You don't have to construct an IP address from a String. With the WebURL IPv4Address type you can also write:

let addr = IPv4Address(octets: (127, 0, 0, 1))

This has some uses for addresses you know at compile-time (loopback addresses, perhaps special services). Since it doesn't require parsing, it isn't a failable initialiser. IPv6 addresses offer the same thing, but they're easier to get wrong because they're bigger.

@milseman milseman mentioned this pull request Nov 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.