Every Time It Returns only absent() #98

Open
hattewarsm opened this issue Jan 7, 2019 · 2 comments

Comments

@hattewarsm

No description provided.

@stefan-reich

stefan-reich commented Jan 23, 2019

You might want to give some more info 😃

@james-s-w-clark

@hattewarsm are you referring to something like:

        List<LanguageProfile> languageProfiles = new LanguageProfileReader().readAllBuiltIn();
        LanguageDetector detector = LanguageDetectorBuilder.create(NgramExtractors.standard())
                .withProfiles(languageProfiles)
                .build();

        Optional<LdLocale> detected = detector.detect("コンコルド001試作機は1969年3月2日にトゥールーズで初飛行した");

and detected has value Optional.absent()?

I tested a few more examples:

  • hello -> absent
  • hello world, how are you doing? -> absent
  • hello world, how are you doing? This string is obviously English! -> Optional.of(en)

The detector only returns a language when its most confident candidate has a confidence of at least 0.9999, which does seem rather high. Anything below that threshold comes back as Optional.absent().

You may be better off calling detector.getProbabilities and taking the most confident language yourself (.get(0), since the results are sorted).
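To make that suggestion concrete, here is a minimal, library-free sketch of the "take the top candidate, apply your own threshold" logic. `Candidate` and `pickBest` are hypothetical names made up for illustration; `Candidate` only stands in for whatever element type `getProbabilities` returns, and the probabilities are invented. It also uses `java.util.Optional` rather than the Guava `Optional` seen above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class BestGuess {
    /** Hypothetical stand-in for one detection result: a locale plus its probability. */
    static final class Candidate {
        final String locale;
        final double probability;

        Candidate(String locale, double probability) {
            this.locale = locale;
            this.probability = probability;
        }
    }

    /**
     * Returns the locale of the first candidate if its probability reaches
     * minConfidence. Assumes the list is already sorted by descending probability.
     */
    static Optional<String> pickBest(List<Candidate> sorted, double minConfidence) {
        if (sorted.isEmpty()) {
            return Optional.empty();
        }
        Candidate best = sorted.get(0);
        return best.probability >= minConfidence
                ? Optional.of(best.locale)
                : Optional.empty();
    }

    public static void main(String[] args) {
        // Invented probabilities, for illustration only.
        List<Candidate> guesses = Arrays.asList(
                new Candidate("en", 0.71),
                new Candidate("nl", 0.29));

        System.out.println(pickBest(guesses, 0.9999)); // prints Optional.empty
        System.out.println(pickBest(guesses, 0.5));    // prints Optional[en]
    }
}
```

With a strict threshold the short-text case yields nothing, while a looser, caller-chosen threshold still surfaces the best guess. Which cut-off is reasonable depends on how costly a wrong guess is for your use case.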

If this isn't the case, I think you'd have to give more information for the ticket not to be rejected.
