Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

include/libzutil.h: fix misuse of strerror_l() #16916

Closed
wants to merge 2 commits into from

Conversation

r-ricci
Copy link
Contributor

@r-ricci r-ricci commented Jan 2, 2025

According to the POSIX man pages:

"The behavior is undefined if the locale argument to strerror_l() is the special locale object LC_GLOBAL_LOCALE or is not a valid locale object handle."

Since uselocale() could return LC_GLOBAL_LOCALE, the value must be checked before passing it to strerror_l().

This fixes a segmentation fault on systems with the musl libc.

How Has This Been Tested?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

According to the POSIX man pages:

"The behavior is undefined if the locale argument to strerror_l() is the
special locale object LC_GLOBAL_LOCALE or is not a valid locale object
handle."

Since uselocale() could return LC_GLOBAL_LOCALE, the value must be
checked before passing it to strerror_l().

This fixes a segmentation fault on systems with the musl libc.

Signed-off-by: Roberto Ricci <[email protected]>
Signed-off-by: Roberto Ricci <[email protected]>
@r-ricci
Copy link
Contributor Author

r-ricci commented Jan 2, 2025

The segmentation fault on musl systems occurs during the boot process, when zfs share -a is invoked, with zfs 2.2.7.

The same command works fine with zfs 2.2.6 and the current master branch.

@amotin
Copy link
Member

amotin commented Jan 2, 2025

It seems to be a fallout from #15793. @rkojedzinszky @mcmilk ^^^. I don't know this area, but I suspect that uselocale() might be wrong, and the proposed patch is only a workaround, possibly incorrect, since it returns non-thread-safe strerror().

@amotin amotin added the Status: Code Review Needed Ready for review and testing label Jan 2, 2025
@rkojedzinszky
Copy link
Contributor

The original intent here was to make zfs_strerror() a thread-safe function. Now as it has faults, and some of the fixes make the code behave the original way (non-thread safe), I think there is no point to keep this function as is right now. Either we should revert the first commit, or fix in a way that it really becomes thread safe.

@r-ricci
Copy link
Contributor Author

r-ricci commented Jan 2, 2025

the proposed patch is only a workaround, possibly incorrect, since it returns non-thread-safe strerror().

You are right, after all the whole point of zfs_strerror() is to be a thread-safe wrapper.

I tried to read the documentation more carefully.
I think that the use of setlocale() in cmd/mount_zfs.c, cmd/zfs/zfs_main.c and cmd/zpool/zpool_main.c is wrong, because if you query the locale using uselocale(0), you must have previously set it with uselocale(some_locale), not setlocale().
setlocale() sets or gets the program's global locale, uselocale() sets or gets the current thread's locale.
Currently uselocale(0) will always return LC_GLOBAL_LOCALE, even with glibc.

The correct use of these interfaces should be something like this:

locale_t loc, ret;

// create a new locale object
loc = newlocale(LC_ALL_MASK, "", (locale_t) 0);
if (loc == (locale_t) 0) {
    // error
}
ret = newlocale(LC_NUMERIC_MASK, "C", loc);
if (ret == (locale_t) 0) {
    freelocale(loc);
    // error
}
loc = ret;

// set the locale
ret = uselocale(loc);
freelocale(loc);
if (ret == (locale_t) 0) {
    // error
}

// query the locale
loc = uselocale(0);
if (loc == LC_GLOBAL_LOCALE) {
    // error
} else if (loc == (locale_t) 0) {
    // error
} else {
    strerror_l(errnum, loc);
}

My patch is incomplete also because uselocale(0) could return 0 too, which is another invalid value for strerror_l().

@r-ricci r-ricci marked this pull request as draft January 2, 2025 18:03
@github-actions github-actions bot added Status: Work in Progress Not yet ready for general review and removed Status: Code Review Needed Ready for review and testing labels Jan 2, 2025
@rkojedzinszky
Copy link
Contributor

rkojedzinszky commented Jan 2, 2025

Reading opengroup specs further, it is also possible that strerror() itself returns an error. strerror_l() can even return null. https://pubs.opengroup.org/onlinepubs/9799919799/functions/strerror.html#

@r-ricci
Copy link
Contributor Author

r-ricci commented Jan 2, 2025

The correct use of these interfaces should be something like this:

Or, after setting the locale with setlocale(), duplocale(LC_GLOBAL_LOCALE) can be used to create a locale_t object which duplicates the global locale, and this can be passed to uselocale() to set the current thread's locale.

@rkojedzinszky
Copy link
Contributor

I've started reading actual libc implementation. Very simple, at least in FreeBSD: https://github.com/freebsd/freebsd-src/blob/main/lib/libc/string/strerror.c#L125.

What if we simply create a thread local buffer (the same in with strerror_l()), call strerror() with a mutex held, and copying its result to the thread-local buffer, then release the mutex? That would preserve original strerror() behavior truly, and we could get thread-safety as well.

@rkojedzinszky
Copy link
Contributor

Something like this?

/*
 * Thread-safe strerror() for use in ZFS libraries
 */
static inline char *zfs_strerror(int errnum) {
	static __thread char buf[2048];
	static __thread pthread_mutex_t buf_lock = PTHREAD_MUTEX_INITIALIZER;

	pthread_mutex_lock(&buf_lock);
	strlcpy(buf, strerror(errnum), sizeof(buf));
	pthread_mutex_unlock(&buf_lock);

	return buf;
}

@behlendorf
Copy link
Contributor

Since we already depend on thread local variables in a few places that's probably a workable way to handle this.

@behlendorf
Copy link
Contributor

Resolved. Applied the change from #16923 for this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Work in Progress Not yet ready for general review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants