exotic characters handling #53
Comments
(I remember sending an email to the mailing list some time ago about a related issue, but this issue was tested with the up-to-date GitHub version.)
Hi Joel ... This worked for a while and re-broke w/ the conversion to httr. I got it working "better" w/ httr last weekend, but still working on it.
OK, good to know there's nothing wrong on my side at least ;-)
Joël Gombin Doctorant en science politique / PhD candidate in political science 277, rue du Faubourg Saint-Antoine
Nope. Even pre-httr there were characters that would slip through, and perhaps the current situation is the same as it was before. The forceUtf8Conversion argument was removed because everything is now forced to UTF-8 in the httr call. FWIW, I'm doing this: fromJSON(content(response, as="text", encoding="UTF-8")), which may or may not be the best way to handle the JSON response in terms of getting the best results from httr.
It seems that when a status contains "exotic" (non-alphanumeric) characters, the twitteR functions may not handle them correctly (at least to the best of my understanding, which remains limited).
An example:
showStatus("439748835238510592")
returns
"AsapVoltaire: RT @CoralieDji: \"Tu penses quoi de Beyoncé ?\" \n\"Je pense que Dieudonné a sa place la dedans\" - @AsapVoltaire \xed\xa0\xbd\xed\xb8\x82\xed\xa0\xbd\xed\xb8\x82\xed\xa0\xbd\xed\xb8\x82"
Is there any way to postprocess this? (By the way, I tried the forceUtf8Conversion=TRUE option, but it throws: Error in tw_from_response(out, ...) : unused argument (forceUtf8Conversion = TRUE).)
If there's no way to fix this, I think we should at least filter these characters out, because otherwise they cause errors in further processing.
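One possible way to filter these bytes out after the fact, as a rough sketch only: the trailing bytes in the example (\xed\xa0\xbd\xed\xb8\x82) look like the two halves of a UTF-16 surrogate pair (U+D83D, U+DE02), each written as its own three-byte sequence, which is not valid UTF-8. The helper below, strip_surrogate_bytes, is a hypothetical name and not part of twitteR; it assumes the rest of the string is already valid UTF-8, and it simply drops the offending bytes (so the emoji is lost rather than recovered).

```r
# Hypothetical helper (not part of twitteR): remove three-byte sequences of the
# form 0xED 0xA0-0xBF 0x80-0xBF, i.e. surrogate halves encoded as if they were
# UTF-8, from a character vector.
strip_surrogate_bytes <- function(x) {
  # Match byte-wise with PCRE so the invalid sequences are not interpreted
  # as (broken) multibyte characters.
  out <- gsub("\\xed[\\xa0-\\xbf][\\x80-\\xbf]", "", x,
              perl = TRUE, useBytes = TRUE)
  # A byte-wise substitution may leave the result marked as "bytes";
  # declare it as UTF-8 again (assumes the remaining bytes are valid UTF-8).
  Encoding(out) <- "UTF-8"
  out
}

# Test input built from raw bytes: "Beyoncé " followed by one encoded
# surrogate pair, mimicking the tail of the status text above.
status <- rawToChar(as.raw(c(
  0x42, 0x65, 0x79, 0x6f, 0x6e, 0x63, 0xc3, 0xa9, 0x20,  # "Beyoncé "
  0xed, 0xa0, 0xbd, 0xed, 0xb8, 0x82                     # surrogate halves
)))

clean <- strip_surrogate_bytes(status)
print(clean)
```

Applied to the text of a status, this would leave the accented characters alone and remove only the surrogate bytes, which should be enough to stop later processing steps from failing on invalid input.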