Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

facil.io update #140

Open
boazsegev opened this issue Nov 10, 2024 · 5 comments
Open

facil.io update #140

boazsegev opened this issue Nov 10, 2024 · 5 comments

Comments

@boazsegev
Copy link

Dear Rene (@renerocksai),

I don't know if you're aware, but I am slowly progressing with the new version for the facil.io library.

The logic moved to the facil.io C STL and the 0.8.x framework would be a wrapper around this library.

This is a total rewrite with support for Windows as well as changes to the core architecture.

For example, the IO is now concentrated in a central "master" IO thread and any calls to fio_write by the worker threads are re-routed to the IO thread. This should improve multi-threading performance and allow the IO to continue even when all worker threads are blocked on database calls or other tasks.

There are also more features (improved HTTP / WebSocket client support and automatic HTTP Streaming for HTTP/1.1) which the new design tempted me to implement.

There's yet work to be done, but I think all of the features from the facil.io 0.7.x version (except for the Redis support) are already implemented, so it's possible to start the transition.

It's important for me to keep supporting Zap in everything that you need. So if there are any features / specific needs, please let me know.

Cheers.

@renerocksai
Copy link
Member

Hi Bo,

Wow!!! That's amazing news! I was aware you were working on it and occasionally checked out your GitHub. But I wasn't aware it has reached a state where starting the transition became feasible!

I am extremely busy these days, but I will try to make time when things calm down around the Christmas / new year season.

So far, what I read sounds really good!

I don't care about redis TBH, so that's not an issue for me.

What comes to mind regarding your new design, is: it might lend itself nicely to io_uring as a potential option for Linux. But let's not get ahead of ourselves 🤣.

I'll try to check out the repo(s) to estimate how to go about it.

It would be really amazing to have a new improved facil.io that also supports Windows, chunked responses, and decoupled worker & IO threads!!!!

Thanks for your great work!

Best,
Rene

@renerocksai
Copy link
Member

renerocksai commented Nov 13, 2024

Oh. One MAJOR wish from my side. If it were at all possible to pass an allocator to facil.io, that would be amazing. In Zig, we really love to use our specific allocators. I vaguely remember having seen temp allocations in facil.io, which is already great.

So, if there was a way to pass a "callback" (maybe there is a more efficient way than going through an indirection) for malloc, realloc, free - that would be great. Then zap users could benefit from sticking their own allocators in.

Probably a bit more complicated - but I am only dumping wishes here - would be to somehow return from functions when an allocation failed. E.g. you call "malloc", it returns out of memory, and you return "out of memory" from the function that called malloc. You probably only need to return the error if you can't take care of it in the library or when it makes sense to let the user code know.

Actually, memory management in facil.io is pretty advanced already. I see you're using mmap and arenas. So, in wishful thinking mode, it would be great to be able to pass to facil.io an arena allocator struct / a set of functions (alloc, free, reset, ...). And an allocator struct (or set of functions) as the main allocator. And the icing on the cake would be to return out of memory errors to user code where appropriate.

Maybe one option is to have those allocators linked in optionally - but I'm just brainstorming here.

I am not sure, especially with returning errors, how well that concept would fit into facil.io. From what I can see, you're already taking care of "failed allocations" - which is the main point in Zig for returning out of memory errors. (Have the program = server not crash on a failed malloc()).

Let me know how open you are to suggestions like this. Maybe we can have a chat / call one day - or alternatively, when I get time to think more deeply about it, I can write up something more concrete.

Thanks for your amazing work anyway. facil.io has been a game changer in my server projects! BTW if "we" can make it even faster - O M G !!!

@renerocksai
Copy link
Member

renerocksai commented Nov 13, 2024

Ah, one more question (you asked for it 😄): I often get asked how to make http requests from zap. AFAICT and what my early experiments confirmed to me: facil.io really likes to be either server or client.

Do you plan to support "mixed mode" in the future? The decoupled IO might (mayyyyyyyybe) open up that possibility?

Use-case: make an outbound HTTP request from within a request handler. I know this comes with its own set of problems. E.g. you're blocking a handler thread on potentially slow network requests that can run into multi-second timeouts. So, IMHO, ideally, you'd want async http requests. Suspend the worker and wake it up when there is a response. But, with the new decoupled IO, I think user code could handle the async part. E.g. using a thread (from a pool) or even more sophisticated real async co-routine implementations (e.g. zigcoro). So no need to make it facil.io's problem. As long as there is a(n indirect) way to write to the response's connection socket, and signal when finished with the request, all the building blocks should be there to go crazy on (pseudo-) async stuff in request handlers.

To me, mixing server & client modes is not a super hard issue, as one can always link in libcurl or similar.

@boazsegev
Copy link
Author

boazsegev commented Nov 14, 2024

Hi Rene,

Thank you so much for your enthusiasm and suggestions 🙏🏻🥳

I'll answer the HTTP Client thing first (as it's a solved question) before diving into memory management and performance.


often get asked how to make http requests from zap

I think that using the new fio_http_connect would make this a breeze in some ways (and a headache in others), as it would allow the same server style callback approach to be used for client connections.

make an outbound HTTP request from within a request handler ... ideally, you'd want async http requests.

The request is already async, but it would be easy to mixup the HTTP handles (the client handle and the server handle). It would also be important to store the client handle somewhere (probably using .udata = fio_http_dup(h) when establishing the client connection).


pass an allocator to facil.io

Wow, that's a big one.

Right now facil.io provides macros that allow developers to choose their allocators during compile-time. See the "Default Memory Allocation" section in the documentation.

However, once FIO_MEM_REALLOC and FIO_MEM_FREE were defined for a specific type / module during compile-time, these are impossible to change dynamically.

I assume we could have FIO_MEM_REALLOC route to a function pointer that could be controlled post-compilation. However, this would still be limited on a per type / module basis. Moreover, this may cause issues when switching allocator implementation after an allocation already occurred (as the wrong free function will be used).

We should definitely brain-storm possible approaches.

the icing on the cake would be to return out of memory errors to user code

I actually believe that it is better for a compromised server to crash rather than handle the error (as error handling itself may require memory).

The worker process sentinel approach was chosen so we could limit the crash to a specific process and release its resources back to the system – hopefully, freeing any leaked memory and limiting any effects to a small subset of active users (who should experience unexpected disconnections rather than timeouts).


BTW if "we" can make it even faster...

I'm not sure where the current bottlenecks might be and would love to push performance even further.

I suspect the queue implementation is one bottleneck when multi-threading, as well as the IO thread wake-up approach (using pipes to wake the main thread from epoll / poll / kqueue when events are pushed from non-IO worker threads)... but I find it difficult to figure out faster approaches.

Another possible improvement would be to combine the allocation of the IO object and its per-protocol memory requirements (currently stored in the IO's udata). This would possibly improve cache locality while also minimizing the number of allocations when establishing a new connection.

This would probably require (again) to slightly break / change the new API and re-design the server module. I might play around with this idea, but I'm not sure how that would be done just yet (possibly by adding an io_memory_size in the fio_protocol_s). This might also impose limitations on protocol switching (as the size cannot expend once allocated), but I think this could actually work.

Thanks again!
B.

@boazsegev
Copy link
Author

I broke a bunch of things and put them back together, but now we get the option to perform one less allocation when attaching a socket.

I still need to incorporate the change to the HTTP module and see if I can expand it to the TLS module as well... but I'm not sure how much of a performance gain this would have.

I did take the opportunity to rename the FIO_SERVER module to FIO_IO and fix the namespace consistency issue in that module. Now it should be clearer that this isn't only for Server operations (but also for client operations)...

... this does break the previous API, but the changes are basically a "find and replace" for:

"FIO_SERVER" => "FIO_IO"
"_srv_" => "_io_"
"fio_s " => "fio_io_s "
"fio_protocol" => "fio_io_protocol"
"fio_tls " => "fio_io_tls"
"fio_read" => "fio_io_read"
"fio_write" => "fio_io_write"
"fio_udata" => "fio_io_udata"
 "fio_io_udata_get" =>  "fio_io_udata"
 "fio_io_protocol_get" =>  "fio_io_protocol"

I think this list covers the changes to the FIO_IO module.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants