This repository has been archived by the owner on Apr 11, 2019. It is now read-only.

exotic characters handling #53

Open
joelgombin opened this issue Mar 3, 2014 · 4 comments


@joelgombin
Contributor

It seems that when a status contains "exotic" (non-alphanumeric) characters, the twitteR functions may not handle them correctly (at least to the best of my understanding, which remains limited).

An example:
showStatus("439748835238510592")

returns

"AsapVoltaire: RT @CoralieDji: \"Tu penses quoi de Beyoncé ?\" \n\"Je pense que Dieudonné a sa place la dedans\" - @AsapVoltaire \xed\xa0\xbd\xed\xb8\x82\xed\xa0\xbd\xed\xb8\x82\xed\xa0\xbd\xed\xb8\x82"

Is there any way to post-process this? (By the way, I tried the forceUtf8Conversion=TRUE option, but it throws: Error in tw_from_response(out, ...) : unused argument (forceUtf8Conversion = TRUE).)

If there's no way to fix this, I think we should at least filter these characters out, because otherwise they cause errors in later processing.
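
For what it's worth, the trailing bytes in the output above look like UTF-16 surrogate halves that were each written out as a three-byte sequence (an encoding sometimes called CESU-8) rather than as one four-byte UTF-8 character. Decoding the first six bytes by hand gives U+1F602. Below is a minimal base-R sketch of one possible post-processing step, under that assumption. `fix_surrogates` is a hypothetical helper, not part of twitteR, and it only handles well-formed pairs in a single string.

```r
# Hypothetical helper (not part of twitteR): rewrite surrogate pairs that were
# encoded as two 3-byte sequences into the single 4-byte UTF-8 character.
fix_surrogates <- function(x) {
  b <- as.integer(charToRaw(x))   # work on raw bytes, independent of locale
  n <- length(b)
  out <- integer(0)
  i <- 1L
  is_cont <- function(v) v >= 0x80 && v <= 0xBF   # UTF-8 continuation byte
  while (i <= n) {
    # High surrogate: ED A0..AF xx, immediately followed by low: ED B0..BF xx
    is_pair <- i + 5L <= n &&
      b[i] == 0xED && b[i + 1L] >= 0xA0 && b[i + 1L] <= 0xAF &&
      is_cont(b[i + 2L]) &&
      b[i + 3L] == 0xED && b[i + 4L] >= 0xB0 && b[i + 4L] <= 0xBF &&
      is_cont(b[i + 5L])
    if (is_pair) {
      # Recover the two UTF-16 code units, then the real code point
      hi <- 0xD000 + (b[i + 1L] %% 64) * 64 + b[i + 2L] %% 64
      lo <- 0xD000 + (b[i + 4L] %% 64) * 64 + b[i + 5L] %% 64
      cp <- 0x10000 + (hi - 0xD800) * 1024 + (lo - 0xDC00)
      out <- c(out, as.integer(charToRaw(intToUtf8(cp))))
      i <- i + 6L
    } else {
      out <- c(out, b[i])         # copy every other byte unchanged
      i <- i + 1L
    }
  }
  res <- rawToChar(as.raw(out))
  Encoding(res) <- "UTF-8"
  res
}

# "RT " followed by the six problematic bytes from the status above
bad <- rawToChar(as.raw(c(0x52, 0x54, 0x20, 0xed, 0xa0, 0xbd, 0xed, 0xb8, 0x82)))
fixed <- fix_surrogates(bad)
utf8ToInt(fixed)   # code points: 82 84 32 128514 (U+1F602)
```

To filter the characters out instead of decoding them, the `is_pair` branch could skip the six bytes without appending anything to `out`.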

@joelgombin
Contributor Author

(I remember sending an email to the mailing list some time ago about a related issue, but this one was tested with the up-to-date GitHub version.)

@geoffjentry
Owner

Hi Joel ...

This worked for a while and re-broke w/ the conversion to httr. I got it working "better" w/ httr last weekend, but still working on it.

@joelgombin
Contributor Author

OK, good to know there's nothing wrong on my side at least ;-)


@geoffjentry
Owner

Nope. Even pre-httr there were characters that would slip through, and perhaps the current situation is the same as it was before. The forceUtf8Conversion argument was removed because everything is now being forced to UTF-8 in the httr call. FWIW, I'm doing this:

fromJSON(content(response, as="text", encoding="UTF-8"))

Which may or may not be the best thing to do w/ the JSON response in terms of getting the best results from httr.
