Query about Text Preprocessing (Encoding)
On 29/05/2016 3:20 AM, Khadija Shakeel wrote:
I want to work with the Urdu language, but R only displays Urdu text and can't work with it. I want to apply the preprocessing steps of text mining, but R is not responding for this text. How can I handle this problem? Here are some pictures of a word cloud of Urdu text.
R doesn't currently have a translation team (see translation.r-project.org) for Urdu, so it may be hard for you to get Urdu-specific support. However, I would guess the problems you are having are common to other languages that use non-Roman alphabets, and you may get some advice from the translation teams for one of them.

The general issues that I know of are:

- R needs to know your encoding. On Unix-alikes the best support is for UTF-8; Windows support is weaker, because Windows tends to use UTF-16 or other multibyte encodings, and R's support for those is mixed.
- You need to make sure your graphics device supports your alphabet. Not all graphics devices have character support for all languages.

Duncan Murdoch
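
As a rough illustration of the encoding point in the reply, the sketch below checks whether R treats a string as UTF-8 before any preprocessing is attempted. It is not part of the original reply. The sample string, the temporary file, and the variable names are assumptions made for the example, and it assumes the input text is stored as UTF-8.

```r
# Hypothetical sample: a short Urdu phrase written with Unicode escapes,
# so the script itself stays ASCII and does not depend on the file encoding.
urdu <- "\u0627\u0631\u062f\u0648 \u0632\u0628\u0627\u0646"

# Ask R how the string is marked and whether its bytes are valid UTF-8.
enc_mark <- Encoding(urdu)
is_valid <- validUTF8(urdu)

# Convert to UTF-8 explicitly (no change if the string is already UTF-8).
urdu <- enc2utf8(urdu)

# Simple preprocessing steps that rely on R knowing the encoding:
# counting characters (not bytes) and splitting into words.
n_chars <- nchar(urdu, type = "chars")
words <- strsplit(urdu, " ", fixed = TRUE)[[1]]

# Round trip through a file. useBytes = TRUE writes the UTF-8 bytes as-is
# rather than re-encoding to the native locale; encoding = "UTF-8" tells
# readLines() to mark the strings it reads as UTF-8.
tmp <- tempfile(fileext = ".txt")
writeLines(urdu, tmp, useBytes = TRUE)
txt <- readLines(tmp, encoding = "UTF-8")
```

If `Encoding()` reports "unknown" for text read from a real file, or `validUTF8()` returns FALSE, the file was probably saved in a different encoding and needs converting (for example with `iconv()`) before text-mining steps will behave sensibly.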